1 Notes

This report was generated on 2021-08-12 16:28:11. R version: 4.1.1 on x86_64-pc-linux-gnu. For this report, CRAN packages as of 2021-06-01 were used.

1.1 R-Script & data

The data were preprocessed and analyzed in R, the R Project for Statistical Computing. The RMarkdown script used to generate this document and all the resulting data can be downloaded under this link. Executing main.Rmd reproduces the process described here and regenerates this document. In the course of this, data from the folder input are processed and results are written to output. The online HTML version of the analysis can be accessed through this link.

1.2 GitHub

The code for the herein described process can also be freely downloaded from https://github.com/fernandomillanvillalobos/datavizR.

1.3 License

1.4 Data description of output files

1.4.0.1 abc.csv (Example)

Attribute Type Description
a Numeric
b Numeric
c Numeric

1.4.0.2 xyz.csv

2 Set up

## [1] "package package:rstudioapi detached"
## [1] "package package:knitr detached"

2.1 Define packages

# from https://mran.revolutionanalytics.com/web/packages/\
# checkpoint/vignettes/using-checkpoint-with-knitr.html
# if you don't need a package, remove it from here (commenting not sufficient)
# tidyverse: see https://blog.rstudio.org/2016/09/15/tidyverse-1-0-0/
cat("
library(rstudioapi)
library(tidyverse, warn.conflicts = FALSE) # ggplot2, dplyr, tidyr, readr, purrr, tibble, magrittr, readxl
library(scales) # scales for ggplot2
library(lintr) # code linting
library(rmarkdown)
library(cowplot) # theme
library(extrafont)
library(ggrepel) # text labels
library(gapminder) #data sets 
library(socviz) # book Data Visualization: A Practical...
library(RColorBrewer)
library(ggforce)
library(dichromat) # palettes for color-blind
library(ggridges) # density ridges plots
library(viridis) # colors
library(palmerpenguins)
library(lubridate)
library(ggthemes) # set of themes
library(nycflights13) # ds example
library(broom) # cleans model output 
library(glue) # for easy text formatting 
library(ggiraph) # for interaction 
library(hexbin)
library(patchwork)
library(distributional) # for dist_normal() 
library(psych)
library(introdataviz)
library(ggalluvial)
library(ggdist)
library(ds4psy) # book Data Science for Psychologists
library(gganimate)",
file = "manifest.R")

2.2 Install packages

# if checkpoint is not yet installed, install it (for people using this
# system for the first time)
if (!require(checkpoint)) {
  if (!require(devtools)) {
    install.packages("devtools", repos = "http://cran.us.r-project.org")
    require(devtools)
  }
  devtools::install_github("RevolutionAnalytics/checkpoint",
                           ref = "v0.3.2", # could be adapted later,
                           # as of now (beginning of July 2017
                           # this is the current release on CRAN)
                           repos = "http://cran.us.r-project.org")
  require(checkpoint)
}
# nolint start
if (!dir.exists("~/.checkpoint")) {
  dir.create("~/.checkpoint")
}
# nolint end
# install packages for the specified CRAN snapshot date
checkpoint(snapshot_date = package_date,
           project = path_to_wd,
           verbose = TRUE,
           scanForPackages = TRUE,
           use.knitr = FALSE,
           R.version = r_version)
rm(package_date)

2.3 Load packages

source("manifest.R")
unlink("manifest.R")
sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.2 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] gganimate_1.0.7         ds4psy_0.7.0            ggdist_3.0.0           
##  [4] ggalluvial_0.12.3       introdataviz_0.0.0.9002 psych_2.1.6            
##  [7] distributional_0.2.2    patchwork_1.1.1         hexbin_1.28.2          
## [10] ggiraph_0.7.10          glue_1.4.2              broom_0.7.9            
## [13] nycflights13_1.0.2      ggthemes_4.2.4          lubridate_1.7.10       
## [16] palmerpenguins_0.1.0    viridis_0.6.1           viridisLite_0.4.0      
## [19] ggridges_0.5.3          dichromat_2.0-0         ggforce_0.3.3          
## [22] RColorBrewer_1.1-2      socviz_1.2              gapminder_0.3.0        
## [25] ggrepel_0.9.1           extrafont_0.17          cowplot_1.1.1          
## [28] rmarkdown_2.10          lintr_2.0.1             scales_1.1.1           
## [31] forcats_0.5.1           stringr_1.4.0           dplyr_1.0.7            
## [34] purrr_0.3.4             readr_2.0.0             tidyr_1.1.3            
## [37] tibble_3.1.3            ggplot2_3.3.5           tidyverse_1.3.1        
## [40] rstudioapi_0.13         checkpoint_1.0.0       
## 
## loaded via a namespace (and not attached):
##  [1] colorspace_2.0-2  ellipsis_0.3.2    rprojroot_2.0.2   fs_1.5.0         
##  [5] farver_2.1.0      remotes_2.4.0     fansi_0.5.0       xml2_1.3.2       
##  [9] mnormt_2.0.2      knitr_1.33        polyclip_1.10-0   jsonlite_1.7.2   
## [13] Rttf2pt1_1.3.8    dbplyr_2.1.1      compiler_4.1.1    httr_1.4.2       
## [17] backports_1.2.1   assertthat_0.2.1  lazyeval_0.2.2    cli_3.0.1        
## [21] tweenr_1.0.2      prettyunits_1.1.1 htmltools_0.5.1.1 tools_4.1.1      
## [25] gtable_0.3.0      Rcpp_1.0.7        cellranger_1.1.0  jquerylib_0.1.4  
## [29] vctrs_0.3.8       nlme_3.1-152      extrafontdb_1.0   xfun_0.25        
## [33] ps_1.6.0          rvest_1.0.1       lifecycle_1.0.0   MASS_7.3-54      
## [37] unikn_0.4.0       hms_1.1.0         rex_1.2.0         parallel_4.1.1   
## [41] yaml_2.2.1        gridExtra_2.3     sass_0.4.0        stringi_1.7.3    
## [45] desc_1.3.0        cyclocomp_1.1.0   rlang_0.4.11      pkgconfig_2.0.3  
## [49] systemfonts_1.0.2 evaluate_0.14     lattice_0.20-44   htmlwidgets_1.5.3
## [53] processx_3.5.2    tidyselect_1.1.1  plyr_1.8.6        magrittr_2.0.1   
## [57] R6_2.5.0          generics_0.1.0    DBI_1.1.1         pillar_1.6.2     
## [61] haven_2.4.3       withr_2.4.2       modelr_0.1.8      crayon_1.4.1     
## [65] uuid_0.1-4        utf8_1.2.2        tmvnsim_1.0-2     tzdb_0.1.2       
## [69] progress_1.2.2    grid_4.1.1        readxl_1.3.1      callr_3.7.0      
## [73] reprex_2.0.1      digest_0.6.27     munsell_0.5.0     bslib_0.2.5.1

2.4 Load additional scripts

# if you want to outsource logic to other script files, see README for 
# further information
# Load all visualizations functions as separate scripts
knitr::read_chunk("scripts/dviz.supp.R")
source("scripts/dviz.supp.R")
knitr::read_chunk("scripts/themes.R")
source("scripts/themes.R")
knitr::read_chunk("scripts/plot_grid.R")
source("scripts/plot_grid.R")
knitr::read_chunk("scripts/align_legend.R")
source("scripts/align_legend.R")
knitr::read_chunk("scripts/label_log10.R")
source("scripts/label_log10.R")
knitr::read_chunk("scripts/outliers.R")
source("scripts/outliers.R")

3 Data Visualization: A Practical Introduction (Kieran Healy)

3.1 Show the Right Numbers

3.1.1 Grouped Data and the “Group” Aesthetic

The group aesthetic is usually only needed when the grouping information you need to tell ggplot about is not built into the variables being mapped.

p <- ggplot(data = gapminder,
            mapping = aes(x = year,
                          y = gdpPercap))
p + geom_line(aes(group=country))
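
As a minimal sketch of what the group aesthetic changes, the toy data frame below (hypothetical values, not taken from gapminder) is drawn with and without group, and layer_data() is used to count how many separate lines ggplot would build in each case.

```r
library(ggplot2)

# Hypothetical yearly values for three made-up countries
toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 3),
  year    = rep(c(2000, 2005, 2010), times = 3),
  gdp     = c(1, 2, 3, 4, 6, 8, 2, 2, 5)
)

p_toy <- ggplot(toy, aes(x = year, y = gdp))

# Without group, every row belongs to one and the same line
no_group <- layer_data(p_toy + geom_line())

# With group = country, each country gets its own line
with_group <- layer_data(p_toy + geom_line(aes(group = country)))

length(unique(no_group$group))
length(unique(with_group$group))
```

The first count is a single group, which is why the ungrouped plot zigzags across countries; the second is one group per country.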

3.1.2 Facet to Make Small Multiples

The facet_wrap() function can take a series of arguments, but the most important is the first one, specified using R’s “formula” syntax with the tilde character, ~. Facets are usually a one-sided formula. Most of the time you will just want a single variable on the right side of the formula.

p <- ggplot(data = gapminder,
            mapping = aes(x = year,
                          y = gdpPercap))
p + geom_line(aes(group = country)) + facet_wrap(~ continent)

p <- ggplot(data = gapminder, mapping = aes(x = year, y = gdpPercap))
p + geom_line(color="gray70", aes(group = country)) +
    geom_smooth(size = 1.1, method = "loess", se = FALSE) +
    scale_y_log10(labels=scales::dollar) +
    facet_wrap(~ continent, ncol = 5) +
    labs(x = "Year",
         y = "GDP per capita",
         title = "GDP per capita on Five Continents")

The facet_wrap() function is best used when you want a series of small multiples based on a single categorical variable. Your panels will be laid out in order and then wrapped into a grid. If you wish you can specify the number of rows or the number of columns in the resulting layout. Facets can be more complex than this. For instance, you might want to cross-classify some data by two categorical variables. In that case you should try facet_grid() instead. This function will lay out your plot in a true two-dimensional arrangement, instead of a series of panels wrapped into a grid.

p <- ggplot(data = gss_sm,
            mapping = aes(x = age, y = childs))
p + geom_point(alpha = 0.2) +
    geom_smooth() +
    facet_grid(sex ~ race)

Multipanel layouts of this kind are especially effective when used to summarize continuous variation (as in a scatterplot) across two or more categorical variables, with the categories (and hence the panels) ordered in some sensible way.
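
To make the difference between the two faceting functions concrete, here is a small sketch using the built-in mtcars data (an assumption made for illustration; it is not one of the data sets used above). It inspects the panel layout of the built plot through ggplot_build(); the layout element is an internal structure, so treat this as a way of peeking, not as a stable API.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# facet_wrap(): one variable, panels laid out in order and wrapped
wrap_layout <- ggplot_build(p + facet_wrap(~ cyl, ncol = 2))$layout$layout

# facet_grid(): two variables crossed, rows ~ columns
grid_layout <- ggplot_build(p + facet_grid(am ~ cyl))$layout$layout

nrow(wrap_layout)                        # number of wrapped panels
table(grid_layout$ROW, grid_layout$COL)  # one panel per row/column cell
```

With facet_wrap() the three cyl panels are simply wrapped into two columns, while facet_grid() produces a full two-by-three grid, one row per am value and one column per cyl value.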

3.1.3 Geoms Can Transform Data

Some geoms plot our data directly on the figure, as is the case with geom_point(), which takes variables designated as x and y and plots the points on a grid. But other geoms clearly do more work on the data before it gets plotted. Every geom_ function has an associated stat_ function that it uses by default. The reverse is also the case: every stat_ function has an associated geom_ function that it will plot by default if you ask it to. Sometimes the calculations being done by the stat_ functions that work together with the geom_ functions might not be immediately obvious. When ggplot calculates the count or the proportion, it returns temporary variables that we can use as mappings in our plots.

p <- ggplot(data = gss_sm, mapping = aes(x = bigregion)) 
p + geom_bar() # geom_bar called the default stat_ function associated with it, stat_count().

# Mapping y = ..prop.. replaces the count on the y-axis with a proportion, but every bar has a value of 1, so all the bars are the same height. We want the bars to sum to 1, so that each shows the number of observations per region as a proportion of the total number of observations. This is a grouping issue again. In a sense, it’s the reverse of the earlier grouping problem we faced when we needed to tell ggplot that our yearly data was grouped by country.

p <- ggplot(data = gss_sm,
            mapping = aes(x = bigregion))
p + geom_bar(mapping = aes(y = ..prop..))

# In this case, we need to tell ggplot to ignore the x-categories when calculating the denominator of the proportion, and to use the total number of observations instead. To do so we specify group = 1 inside the aes() call. The value of 1 is just a kind of “dummy group” that tells ggplot to use the whole dataset when establishing the denominator for its prop calculations.

p <- ggplot(data = gss_sm,
            mapping = aes(x = bigregion))
p + geom_bar(mapping = aes(y = ..prop.., group = 1)) # 1 is a dummy group

# Another example
p <- ggplot(data = gss_sm,
            mapping = aes(x = religion, fill = religion))
p + geom_bar() + guides(fill = "none") # guides(fill = "none") removes the legend; guides(fill = FALSE) is deprecated since ggplot2 3.3.4
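
The effect of group = 1 can be checked numerically. The sketch below uses a six-row toy data frame (hypothetical, standing in for gss_sm) and layer_data() to look at the temporary prop variable that stat_count() computes. It writes after_stat(prop), the newer spelling of ..prop.. available from ggplot2 3.3.0; both refer to the same computed variable.

```r
library(ggplot2)

# Hypothetical data: 2 observations in N, 3 in S, 1 in W
toy <- data.frame(region = c("N", "N", "S", "S", "S", "W"))
p_toy <- ggplot(toy, aes(x = region))

# Default grouping: each x category is its own group, so prop is 1 everywhere
by_x <- layer_data(p_toy + geom_bar(aes(y = after_stat(prop))))

# Dummy group: the denominator is the whole data set
overall <- layer_data(p_toy + geom_bar(aes(y = after_stat(prop), group = 1)))

by_x$prop
overall$prop
sum(overall$prop)
```

In the first case every bar is its own denominator; in the second the proportions are 2/6, 3/6 and 1/6 and sum to 1.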

3.1.4 Frequency Plots the Slightly Awkward Way

A more appropriate use of the fill aesthetic with geom_bar() is to cross-classify two categorical variables. This is the graphical equivalent of a frequency table of counts or proportions. When we cross-classify categories in bar charts, there are several ways to display the results. With geom_bar() the output is controlled by the position argument.

p <- ggplot(data = gss_sm,
            mapping = aes(x = bigregion, fill = religion))
p + geom_bar() # The default output of geom_bar() is a stacked bar chart

# An alternative choice is to set the position argument to "fill".
p <- ggplot(data = gss_sm,
            mapping = aes(x = bigregion, fill = religion))
p + geom_bar(position = "fill") # the bars are all the same height 

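# When we just wanted the overall proportions for one variable, we mapped group = 1 to tell ggplot to calculate the proportions with respect to the overall N. Here we map group = religion instead. Note that the proportions are then computed within each religion, so the bars of one religion sum to 1 across regions, not the bars within one region.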
p <- ggplot(data = gss_sm,
            mapping = aes(x = bigregion, fill = religion))
p + geom_bar(position = "dodge",
             mapping = aes(y = ..prop.., group = religion))

# We can ask ggplot to give us a proportional bar chart of religious affiliation, and then facet that by region
p <- ggplot(data = gss_sm,
            mapping = aes(x = religion))
p + geom_bar(position = "dodge",
             mapping = aes(y = ..prop.., group = bigregion)) +
    facet_wrap(~ bigregion, ncol = 1)
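
Because it is easy to lose track of which margin a proportion refers to, the following sketch checks it on a small hypothetical data frame (the values are made up; region and religion only mimic the gss_sm variables). It again uses after_stat(prop) as the newer spelling of ..prop...

```r
library(ggplot2)

# Hypothetical cross-classified data
toy <- data.frame(
  region   = c("N", "N", "N", "N", "S", "S"),
  religion = c("a", "a", "a", "b", "a", "b")
)

p_toy <- ggplot(toy, aes(x = region, fill = religion))
d <- layer_data(
  p_toy + geom_bar(position = "dodge",
                   mapping = aes(y = after_stat(prop), group = religion))
)

# Sum of the bar heights within each group (i.e. within each religion)
group_sums <- tapply(d$prop, d$group, sum)
group_sums
```

The bar heights sum to 1 within each religion, which confirms that group = religion gives proportions of each religion across regions. If proportions within each region are wanted, faceting by region or computing the table beforehand is the safer route.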

3.1.5 Histograms and density plots

A histogram is a way of summarizing a continuous variable by chopping it up into segments or “bins” and counting how many observations are found within each bin. In a bar chart, the categories are given to us going in (e.g., regions of the country, or religious affiliation). With a histogram, we have to decide how finely to bin the data. As with the bar charts, a newly-calculated variable, count, appears on the y-axis.

While histograms summarize single variables, it’s also possible to use several at once to compare distributions. We can facet histograms by some variable of interest.

# By default, geom_histogram() uses 30 bins and prints a message suggesting that we pick a better value with bins or binwidth.
p <- ggplot(data = midwest,
            mapping = aes(x = area))
p + geom_histogram()

# selecting another bin size
p <- ggplot(data = midwest,
            mapping = aes(x = area))
p + geom_histogram(bins = 10)

oh_wi <- c("OH", "WI")
# subset the data; the %in% operator is a convenient way to filter on more than one term in a variable
p <- ggplot(data = subset(midwest, subset = state %in% oh_wi),
            mapping = aes(x = percollege, fill = state))
p + geom_histogram(alpha = 0.4, bins = 20)

# When working with a continuous variable, an alternative to binning the data and making a histogram is to calculate a kernel density estimate of the underlying distribution.
p <- ggplot(data = midwest,
            mapping = aes(x = area, fill = state, color = state))
p + geom_density(alpha = 0.3)

# For geom_density(), the stat_density() function can return its default ..density.. statistic, or ..scaled.., which will give a proportional density estimate. It can also return a statistic called ..count.., which is the density times the number of points. This can be used in stacked density plots.
p <- ggplot(data = subset(midwest, subset = state %in% oh_wi),
            mapping = aes(x = area, fill = state, color = state))
p + geom_density(alpha = 0.3, mapping = aes(y = ..scaled..))
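
The computed variables behind histograms and density plots can be inspected in the same way as for bar charts. This sketch uses the built-in faithful data set (an assumption for illustration, not one of the data sets above) and the after_stat() spelling of ..scaled...

```r
library(ggplot2)

p_f <- ggplot(faithful, aes(x = waiting))

# stat_bin() returns one row per bin with a computed count
h <- layer_data(p_f + geom_histogram(bins = 10))
nrow(h)
sum(h$count)  # every observation falls into exactly one bin

# stat_density() returns density, count and scaled
dens <- layer_data(p_f + geom_density(aes(y = after_stat(scaled))))
max(dens$scaled)        # scaled is the density divided by its maximum
range(dens$density)
```

The bin counts add up to the number of rows in the data, and the scaled density peaks at 1, which is what makes it convenient for comparing the shapes of groups of different sizes.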

3.1.6 Avoid transformations when necessary

Often our data is, in effect, already a summary table. This can happen when we have computed a table of marginal frequencies or percentages from the original data. Because we are working directly with percentage values in a summary table, we no longer have any need for ggplot to count up values for us or perform any other calculations. That is, we do not need the services of any stat_ functions. We can tell geom_bar() not to do any work on the variable before plotting it. To do this we say stat = "identity" in the geom_bar() call.

p <- ggplot(data = titanic,
            mapping = aes(x = fate, y = percent, fill = sex))
p + geom_bar(position = "dodge", stat = "identity") + theme(legend.position = "top")

# For convenience ggplot also provides a related geom, geom_col(), which has exactly the same effect but assumes that stat = "identity".
# The position argument in geom_bar() and geom_col() can also take the value of "identity". Just as stat = "identity" means “don’t do any summary calculations”, position = "identity" means “just plot the values as given”.
p <- ggplot(data = oecd_sum,
            mapping = aes(x = year, y = diff, fill = hi_lo))
p + geom_col() + guides(fill = "none") +
  labs(x = NULL, y = "Difference in Years",
       title = "The US Life Expectancy Gap",
       subtitle = "Difference between US and OECD\naverage life expectancies, 1960-2015",
       caption = "Data: OECD. After a chart by Christopher Ingraham,\nWashington Post, December 27th 2017.")
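
As a quick check that geom_col() and geom_bar(stat = "identity") really plot the values as given, the sketch below builds both from a tiny hypothetical summary table (the percentages are invented, not the titanic figures) and compares the bar heights.

```r
library(ggplot2)

# Hypothetical, already summarized table
tab <- data.frame(fate = c("perished", "survived"),
                  percent = c(68, 32))

p_tab <- ggplot(tab, aes(x = fate, y = percent))

bar_identity <- layer_data(p_tab + geom_bar(stat = "identity"))
col_default  <- layer_data(p_tab + geom_col())

# Both layers pass the y values through untouched
bar_identity$y
col_default$y
```

Neither layer counts anything: the heights are exactly the percent column of the table.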

3.2 Graph Tables, Add Labels, Make Notes

3.2.1 Use Pipes to Summarize Data

Letting the geoms (and their stat_ functions) do the work can sometimes get a little confusing. It is too easy to lose track of whether one has calculated row margins, column margins, or overall relative frequencies. A better strategy is to calculate the frequency table you want first and then plot that table. This has the benefit of allowing you to do some quick sanity checks on your tables, to make sure you haven’t made any errors.

Writing the calculation as a pipeline makes the code easier to read and shows, step by step, the order in which the data are grouped and summarized.

rel_by_region <- gss_sm %>%
    group_by(bigregion, religion) %>% # from outermost to innermost 
    summarize(N = n()) %>%
    mutate(freq = N / sum(N), # calculate relative proportion 
           pct = round((freq*100), 0)) # calculate percentage

# Checking pct: each region should total about 100 (rounding can give 99 or 101)
rel_by_region %>% 
  group_by(bigregion) %>%
  summarize(total = sum(pct))

bigregion   total
Northeast     100
Midwest       101
South         100
West          101

# As a rule, dodged charts can be more cleanly expressed as faceted plots. Faceting removes the need for a legend and thus makes the chart simpler to read.

p <- ggplot(rel_by_region, aes(x = religion, y = pct, fill = religion))
p + geom_col(position = "dodge2") +
    labs(x = NULL, y = "Percent", fill = "Religion") +
    guides(fill = "none") + 
    coord_flip() + # flip the axis 
    facet_grid(~ bigregion)
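
The key step in the pipeline above is that summarize() peels off the innermost grouping level, so the following mutate() computes proportions within the outer group. Here is a minimal sketch of that logic on a hypothetical five-row data frame; the .groups = "drop_last" argument only makes the default behavior explicit.

```r
library(dplyr)

# Hypothetical data standing in for gss_sm
toy <- data.frame(
  region   = c("N", "N", "N", "S", "S"),
  religion = c("a", "a", "b", "a", "b")
)

freq_tab <- toy %>%
  group_by(region, religion) %>%
  summarize(N = n(), .groups = "drop_last") %>% # still grouped by region
  mutate(freq = N / sum(N)) %>%                 # proportion within region
  ungroup()

# Sanity check: proportions should sum to 1 within each region
check <- freq_tab %>%
  group_by(region) %>%
  summarize(total = sum(freq))

freq_tab
check
```

Checking the unrounded freq column avoids the 99 or 101 totals that rounded percentages can produce.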

3.2.2 Continuous Variables by Group or Category

The variables specified in group_by() are retained in the new data frame, the variables created with summarize() are added, and all the other variables in the original data are dropped.

We generally want our plots to present data in some meaningful order. The reorder() function will do this for us. It takes two required arguments. The first is the categorical variable or factor that we want to reorder. In this case, that’s country. The second is the variable we want to reorder it by. Here that is the donation rate, donors. The third and optional argument to reorder() is the function you want to use as a summary statistic. If you give reorder() only the first two required arguments, then by default it will reorder the categories of your first variable by the mean value of the second. You can use any sensible function you like to reorder the categorical variable (e.g., median, or sd).

organdata %>% select(1:6) %>% sample_n(size = 10) # pick a sample 

country      year        donors    pop    pop_dens    gdp
Finland      1991-01-01   16.80   5014   1.4827739  17281
Australia    1994-01-01   10.25  17855   0.2306484  19849
Switzerland  1998-01-01   15.43   7110  17.2196658  28733
Belgium      2001-01-01   22.20  10287  31.0785498  27113
Belgium      1992-01-01   20.60  10045  30.3474320  19444
Sweden       1991-01-01   16.40   8617   1.9150591  19000
Netherlands  1992-01-01   15.10  15184  36.5615218  19285
Spain        NA              NA     NA          NA     NA
Austria      1996-01-01   24.70   7959   9.4908180  23798
Sweden       1993-01-01   15.20   8719   1.9377278  19063

# dotplot
p <- ggplot(data = organdata, mapping = aes(x = year, y = donors)) 
p + geom_point()

# lineplot
p <- ggplot(data = organdata,
            mapping = aes(x = year, y = donors))
p + geom_line(aes(group = country)) + facet_wrap(~ country)

# boxplot
p <- ggplot(data = organdata,
            mapping = aes(x = country, y = donors))
p + geom_boxplot() +
  coord_flip()

# boxplot reordered
p <- ggplot(data = organdata,
            mapping = aes(x = reorder(country, donors, na.rm = TRUE),
                          y = donors))
p + geom_boxplot() +
    labs(x=NULL) +
    coord_flip()

# violin plot reordered and filled
p <- ggplot(data = organdata,
            mapping = aes(x = reorder(country, donors, na.rm=TRUE),
                          y = donors, fill = world))
p + geom_violin() + labs(x=NULL) +
    coord_flip() + theme(legend.position = "top")

# dotplot reordered and colored
p <- ggplot(data = organdata,
            mapping = aes(x = reorder(country, donors, na.rm=TRUE),
                          y = donors, color = world))
p + geom_point() + labs(x=NULL) +
    coord_flip() + theme(legend.position = "top")

# dotplot jittered, reordered and colored
p <- ggplot(data = organdata,
            mapping = aes(x = reorder(country, donors, na.rm=TRUE),
                          y = donors, color = world))
p + geom_jitter(position = position_jitter(width=0.15)) + # to avoid overplotting 
    labs(x=NULL) + coord_flip() + theme(legend.position = "top")
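
Since reorder() is a base R function, its behavior can be checked without drawing anything. The sketch below uses a small hypothetical data frame (the country letters and donor values are invented) and compares the default ordering by mean with an ordering by median.

```r
# Hypothetical donation rates for three made-up countries
d <- data.frame(
  country = rep(c("A", "B", "C"), each = 3),
  donors  = c(10, NA, 12,   # A: mean 11, median 11
              1, 1, 40,     # B: mean 14, median 1
              20, 22, 21)   # C: mean 21, median 21
)

# Default summary function is mean(); na.rm = TRUE is passed on to it
by_mean <- reorder(d$country, d$donors, na.rm = TRUE)

# Any sensible summary function can be supplied through FUN
by_median <- reorder(d$country, d$donors, FUN = median, na.rm = TRUE)

levels(by_mean)
levels(by_median)
```

The factor levels come back sorted by the chosen summary, and that level order is what ggplot uses to arrange the categories on the axis. The outlier in B moves it depending on whether the mean or the median is used.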

When we want to summarize a categorical variable that just has one point per category, we should use this approach as well. The result will be a Cleveland dotplot, a simple and extremely effective method of presenting data that is usually better than a bar chart, a column chart, or a table. When making them, put the categories on the y-axis and order them in the way that is most relevant to the numerical summary you are providing. This sort of plot is also an excellent way to summarize model results or any data with error ranges.

by_country <- organdata %>% 
  group_by(consent_law, country) %>%
  summarize(donors_mean = mean(donors, na.rm = TRUE),
            donors_sd = sd(donors, na.rm = TRUE),
            gdp_mean = mean(gdp, na.rm = TRUE),
            health_mean = mean(health, na.rm = TRUE),
            roads_mean = mean(roads, na.rm = TRUE),
            cerebvas_mean = mean(cerebvas, na.rm = TRUE))

# Doing the same more compactly
by_country <- organdata %>% 
  group_by(consent_law, country) %>%
  summarize_if(is.numeric, list(mean, sd), na.rm = TRUE) %>% # list() replaces the deprecated funs()
  ungroup()
by_country # the list is unnamed, so each column is the original variable name with _fn1 (mean) or _fn2 (sd) appended; a named list such as list(mean = mean, sd = sd) would append the function names instead
consent_law country donors_fn1 pop_fn1 pop_dens_fn1 gdp_fn1 gdp_lag_fn1 health_fn1 health_lag_fn1 pubhealth_fn1 roads_fn1 cerebvas_fn1 assault_fn1 external_fn1 txp_pop_fn1 donors_fn2 pop_fn2 pop_dens_fn2 gdp_fn2 gdp_lag_fn2 health_fn2 health_lag_fn2 pubhealth_fn2 roads_fn2 cerebvas_fn2 assault_fn2 external_fn2 txp_pop_fn2
Informed Australia 10.63500 18317.923 0.2366284 22178.54 21779.43 1957.500 1848.214 5.676923 104.87573 557.6923 16.769231 393.0000 0.8751195 1.1428075 830.89394 0.0107334 3958.506 4085.883 481.6276 460.0962 0.4245661 14.327316 82.69863 1.8327505 26.76440 0.0396299
Informed Canada 13.96667 29607.923 0.2969520 23711.08 23353.07 2271.929 2163.429 6.676923 109.26011 422.3846 16.769231 410.6154 1.0485954 0.7511607 1192.74791 0.0119626 3965.847 4038.868 420.5751 379.1317 0.3919053 17.679258 38.46544 2.4547181 40.76873 0.0424701
Informed Denmark 13.09167 5257.154 12.2004034 23722.31 23275.00 2054.071 1973.429 6.984615 101.63635 640.6923 12.230769 532.3846 0.7610332 1.4681208 80.60691 0.1870664 3895.685 4100.016 371.3614 357.7605 0.1519109 12.421001 46.27163 2.1273554 33.60441 0.0116730
Informed Germany 13.04167 80254.846 22.4784601 22163.23 21938.36 2348.750 2256.250 8.142308 112.78873 706.7692 9.538462 391.3077 0.5508479 0.6111960 5157.63285 1.4445937 2501.344 2546.250 377.2275 383.7725 0.6244485 25.911094 126.03515 1.6641006 56.97424 0.0437534
Informed Ireland 19.79167 3673.615 5.2278574 20824.38 20153.64 1479.929 1340.786 4.876923 117.77424 704.6923 8.538462 394.0000 0.8175829 2.4784373 131.59378 0.1872688 6669.580 6881.862 565.5526 482.4379 0.2862221 10.761587 87.20320 1.5607362 18.85471 0.0287060
Informed Netherlands 13.65833 15547.692 37.4372557 23013.15 22553.64 1992.786 1884.857 5.700000 76.09357 584.9231 11.153846 285.8462 0.7078763 1.5518074 372.96434 0.8980600 3769.961 4009.418 417.0621 377.0521 0.3082207 9.930020 52.23259 1.3445045 18.62725 0.0169759
Informed United Kingdom 13.49167 58186.692 23.9540127 21359.31 20962.50 1561.214 1463.500 5.761539 67.92936 707.9231 8.923077 287.9231 0.7047037 0.7751344 626.34567 0.2578509 3929.497 4056.793 405.0679 374.9447 0.3548203 10.467402 93.43577 1.6563785 15.14079 0.0075729
Informed United States 19.98167 269329.769 2.7970428 29211.77 28699.43 3988.286 3760.429 5.776923 155.16783 444.3846 80.384615 530.0000 1.0193884 1.3253667 12544.86916 0.1302809 4571.160 4791.979 864.9320 807.5220 0.4621577 8.353810 16.04960 17.8724111 32.15587 0.0476772
Presumed Austria 23.52500 7927.308 9.4530261 23875.85 23415.07 1875.357 1803.143 5.492308 149.86541 768.8462 10.923077 506.8462 0.6308434 2.4159037 109.19507 0.1302112 3342.889 3645.228 296.8980 316.8549 0.2660249 30.281692 119.64242 2.3259958 62.13540 0.0088249
Presumed Belgium 21.90000 10153.308 30.6746456 22499.62 22095.93 1958.357 1862.429 6.188889 154.69504 593.8462 14.307692 541.6154 0.7880048 1.9357874 109.16378 0.3297999 3170.584 3400.119 405.1142 403.2977 0.2027588 20.556129 55.24920 3.6602508 22.87116 0.0084838
Presumed Finland 18.44167 5111.846 1.5117096 21018.92 20763.00 1615.286 1559.786 5.861538 93.57447 771.3846 27.461538 721.9231 0.5869704 1.5264089 68.62561 0.0202944 3667.866 3651.757 202.9780 181.5384 0.7274825 19.007381 136.47865 3.2304640 66.87234 0.0079285
Presumed France 16.75833 58055.692 10.5268708 22602.85 22210.71 2159.643 2066.429 7.076923 156.15327 432.6923 8.923077 602.6923 0.7063585 1.5974174 851.44929 0.1543879 3260.346 3459.035 397.2170 371.9650 0.2241794 20.063260 54.53345 1.6563785 46.63758 0.0103484
Presumed Italy 11.10000 57359.692 19.0348750 21554.15 21194.93 1757.000 1689.071 5.984615 121.94294 712.1538 14.923077 368.8462 0.4533029 4.2769998 424.68309 0.1409315 2781.309 2991.191 271.2379 264.0039 0.4469268 10.157891 118.03237 5.9646394 42.80936 0.0033601
Presumed Norway 15.44167 4386.231 1.3542765 26448.38 25769.36 2217.214 2125.429 6.830769 69.99821 661.6154 9.538462 423.3077 0.2280896 1.1090195 97.25752 0.0300289 6491.668 6734.625 606.2047 663.5244 0.3065524 6.676658 100.37890 2.4703368 43.53999 0.0050521
Presumed Spain 28.10833 39666.231 7.8393310 16933.00 16584.29 1289.071 1220.071 5.453846 161.11430 654.7692 8.692308 376.6154 0.7062535 4.9630376 950.90309 0.1879292 2888.343 3066.466 265.8960 269.3863 0.1450022 35.251103 138.65013 0.9473309 35.16281 0.0164238
Presumed Sweden 13.12500 8789.231 1.9533360 22415.46 22094.00 1951.357 1868.000 7.315385 72.34575 595.3077 11.153846 395.8462 0.6827600 1.7535030 113.62376 0.0252520 3213.468 3313.422 372.9790 329.3088 0.2303843 13.246920 49.68465 1.6756170 37.58733 0.0089211
Presumed Switzerland 14.18250 7036.846 17.0424949 27233.00 26931.29 2776.071 2655.643 5.423077 96.38543 423.5385 10.769231 488.2308 0.9953044 1.7090940 169.77330 0.4111729 2153.454 2356.923 475.6701 464.0117 0.6043645 21.701876 72.99956 3.5155333 96.19958 0.0242808
# Cleveland dotplot reordered and colored
p <- ggplot(data = by_country,
            mapping = aes(x = donors_fn1, y = reorder(country, donors_fn1),
                          color = consent_law))
p + geom_point(size = 3) +
    labs(x = "Donor Procurement Rate",
         y = "", color = "Consent Law") +
    theme(legend.position="top")

# Cleveland dotplot reordered, colored and faceted
p <- ggplot(data = by_country,
            mapping = aes(x = donors_fn1,
                          y = reorder(country, donors_fn1)))

# ncol = 1 stacks the panels on top of each other; scales = "free_y" frees the
# categorical y-axis while the continuous x-axis stays fixed across panels
p + geom_point(size=3) +
    facet_wrap(~ consent_law, scales = "free_y", ncol = 1) +
    labs(x= "Donor Procurement Rate",
         y= "")

# Dot-and-whisker plot
p <- ggplot(data = by_country, mapping = aes(x = reorder(country,
              donors_fn1), y = donors_fn1))

p + geom_pointrange(mapping = aes(ymin = donors_fn1 - donors_fn2, # show a point estimate (the mean) and a range around it (plus or minus one sd)
       ymax = donors_fn1 + donors_fn2)) +
     labs(x= "", y= "Donor Procurement Rate") + coord_flip()
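
The _fn1 and _fn2 suffixes used in the plots above come from passing an unnamed list of functions to summarize_if(). The following sketch, on a hypothetical four-row data frame, contrasts that with a named list, which yields more readable column names. It is only an illustration of the naming rule; the code above keeps the unnamed version.

```r
library(dplyr)

# Hypothetical data with two numeric columns
toy <- data.frame(
  country = c("A", "A", "B", "B"),
  donors  = c(10, 12, 20, 26),
  gdp     = c(100, 110, 200, 220)
)

unnamed <- toy %>%
  group_by(country) %>%
  summarize_if(is.numeric, list(mean, sd), na.rm = TRUE)

named <- toy %>%
  group_by(country) %>%
  summarize_if(is.numeric, list(mean = mean, sd = sd), na.rm = TRUE)

names(unnamed)  # suffixes _fn1, _fn2
names(named)    # suffixes _mean, _sd
```

With named functions, a mapping such as ymin = donors_mean - donors_sd documents itself, at the cost of renaming the columns used in the existing plotting code.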

3.2.3 Plot Text Directly

The ggrepel package provides geom_text_repel() and geom_label_repel(), two geoms that can pick out labels much more flexibly than the default geom_text(). The ggrepel package has several other useful geoms and options to aid with effectively plotting labels along with points. The performance of its labeling algorithm is consistently very good. For many purposes it will be a better first choice than geom_text().

elections_historic %>% select(2:7) 
year winner win_party ec_pct popular_pct popular_margin
1824 John Quincy Adams D.-R. 0.3218 0.3092 -0.1044
1828 Andrew Jackson Dem. 0.6820 0.5593 0.1225
1832 Andrew Jackson Dem. 0.7657 0.5474 0.1781
1836 Martin Van Buren Dem. 0.5782 0.5079 0.1420
1840 William Henry Harrison Whig 0.7959 0.5287 0.0605
1844 James Polk Dem. 0.6182 0.4954 0.0145
1848 Zachary Taylor Whig 0.5621 0.4728 0.0479
1852 Franklin Pierce Dem. 0.8581 0.5083 0.0695
1856 James Buchanan Dem. 0.5878 0.4529 0.1220
1860 Abraham Lincoln Rep. 0.5941 0.3965 0.1013
1864 Abraham Lincoln Rep. 0.9099 0.5503 0.1008
1868 Ulysses Grant Rep. 0.7279 0.5266 0.0532
1872 Ulysses Grant Rep. 0.8125 0.5558 0.1180
1876 Rutherford Hayes Rep. 0.5014 0.4792 -0.0300
1880 James Garfield Rep. 0.5799 0.4831 0.0009
1884 Grover Cleveland Dem. 0.5461 0.4885 0.0057
1888 Benjamin Harrison Rep. 0.5810 0.4780 -0.0830
1892 Grover Cleveland Dem. 0.6239 0.4602 0.0301
1896 William McKinley Rep. 0.6063 0.5102 0.0431
1900 William McKinley Rep. 0.6523 0.5164 0.0612
1904 Theodore Roosevelt Rep. 0.7059 0.5642 0.1883
1908 William Taft Rep. 0.6646 0.5157 0.0853
1912 Woodrow Wilson Dem. 0.8192 0.4184 0.1444
1916 Woodrow Wilson Dem. 0.5217 0.4924 0.0312
1920 Warren Harding Rep. 0.7608 0.6032 0.2617
1924 Calvin Coolidge Rep. 0.7194 0.5404 0.2522
1928 Herbert Hoover Rep. 0.8362 0.5821 0.1741
1932 Franklin Roosevelt Dem. 0.8889 0.5741 0.1776
1936 Franklin Roosevelt Dem. 0.9849 0.6080 0.2426
1940 Franklin Roosevelt Dem. 0.8456 0.5474 0.0996
1944 Franklin Roosevelt Dem. 0.8136 0.5339 0.0750
1948 Harry Truman Dem. 0.5706 0.4955 0.0448
1952 Dwight Eisenhower Rep. 0.8324 0.5518 0.1085
1956 Dwight Eisenhower Rep. 0.8606 0.5737 0.1540
1960 John Kennedy Dem. 0.5642 0.4972 0.0017
1964 Lyndon Johnson Dem. 0.9033 0.6105 0.2258
1968 Richard Nixon Rep. 0.5595 0.4342 0.0070
1972 Richard Nixon Rep. 0.9665 0.6067 0.2315
1976 Jimmy Carter Dem. 0.5520 0.5008 0.0206
1980 Ronald Reagan Rep. 0.9089 0.5075 0.0974
1984 Ronald Reagan Rep. 0.9758 0.5877 0.1821
1988 George H. W. Bush Rep. 0.7918 0.5337 0.0772
1992 Bill Clinton Dem. 0.6877 0.4301 0.0556
1996 Bill Clinton Dem. 0.7045 0.4923 0.0851
2000 George W. Bush Rep. 0.5037 0.4787 -0.0510
2004 George W. Bush Rep. 0.5316 0.5073 0.0246
2008 Barack Obama Dem. 0.6784 0.5293 0.0727
2012 Barack Obama Dem. 0.6171 0.5106 0.0386
2016 Donald Trump Rep. 0.5687 0.4625 -0.0175
p_title <- "Presidential Elections: Popular & Electoral College Margins"
p_subtitle <- "1824-2016"
p_caption <- "Data for 2016 are provisional."
x_label <- "Winner's share of Popular Vote"
y_label <- "Winner's share of Electoral College Votes"

p <- ggplot(elections_historic, aes(x = popular_pct, y = ec_pct,
                                    label = winner_label))

# Two new geoms, geom_hline() and geom_vline(), draw the reference lines.
# See also geom_abline(), which draws a straight line from a supplied slope and intercept.
p + geom_hline(yintercept = 0.5, size = 1.4, color = "gray80") +
    geom_vline(xintercept = 0.5, size = 1.4, color = "gray80") +
    geom_point() +
    geom_text_repel() +
    scale_x_continuous(labels = scales::percent) +
    scale_y_continuous(labels = scales::percent) +
    labs(x = x_label, y = y_label, title = p_title, subtitle = p_subtitle,
         caption = p_caption)

3.2.4 Label Outliers

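Sometimes we want to pick out some points of interest in the data without labeling every single item. One way is to give the text geom its own, smaller data set, filtered on the fly with subset(), so that only the selected points receive labels. Alternatively, we can pick out specific points by creating a dummy variable in the data set just for this purpose.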

p <- ggplot(data = by_country,
            mapping = aes(x = gdp_fn1, y = health_fn1))

# Using subset to filter the data
p + geom_point() +
    geom_text_repel(data = subset(by_country, gdp_fn1 > 25000),
                    mapping = aes(label = country))

p <- ggplot(data = by_country,
            mapping = aes(x = gdp_fn1, y = health_fn1))

p + geom_point() +
    geom_text_repel(data = subset(by_country,
                                  gdp_fn1 > 25000 | health_fn1 < 1500 |
                                  country %in% "Belgium"),
                    mapping = aes(label = country))

# Creating a dummy variable to subset the data
organdata$ind <- organdata$ccode %in% c("Ita", "Spa") &
                    organdata$year > 1998

p <- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors, color = ind))
p + geom_point() +
    geom_text_repel(data = subset(organdata, ind),
                    mapping = aes(label = ccode)) +
    guides(label = "none", color = "none")

3.2.5 Write and draw in the plot area

Sometimes we want to annotate the figure directly. We use annotate() for this purpose. It is not a geom itself: we tell annotate() to borrow a geom temporarily, taking advantage of that geom's features in order to place something on the plot. The most obvious use case is putting arbitrary text on the plot with a text geom, but annotate() can work with other geoms, too.

p <- ggplot(data = organdata, mapping = aes(x = roads, y = donors))
p + geom_point() + annotate(geom = "text", x = 91, y = 33,
                            label = "A surprisingly high \n recovery rate.",
                            hjust = 0)
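The point that annotate() can borrow other geoms can be sketched as follows. This is a minimal illustration, not part of the original analysis: it assumes the built-in mtcars data as a stand-in for organdata, and the rectangle coordinates and label text are made up for the example.

```r
library(ggplot2)

# mtcars ships with R; it stands in for organdata in this sketch
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

p_annotated <- p + geom_point() +
    # Borrow the rect geom to shade a region of interest
    annotate(geom = "rect", xmin = 5, xmax = 5.6, ymin = 9.5, ymax = 15.5,
             fill = "red", alpha = 0.2) +
    # Borrow the text geom for a note; hjust = 1 right-aligns the text at x
    annotate(geom = "text", x = 4.9, y = 11,
             label = "Heaviest cars,\nlowest mileage", hjust = 1)

p_annotated
```

Because annotate() takes its positions as plain values rather than mappings, the annotation layers do not need a data frame of their own.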

3.2.6 Understanding Scales, Guides, and Themes

Learning about new geoms extended what we have seen already. Each geom makes a different type of plot. Different plots require different mappings in order to work, and so each geom_ function takes mappings tailored to the kind of graph it draws. You can’t use geom_point() to make a scatterplot without supplying an x and a y mapping, for example. Using geom_histogram() only requires you to supply an x mapping. Similarly, geom_pointrange() requires ymin and ymax mappings in order to know where to draw the lineranges it makes. A geom_ function may take optional arguments, too. When using geom_boxplot() you can specify what the outliers look like using arguments like outlier.shape and outlier.color.

Now we’ll make use of new functions for controlling some aspects of the appearance of our graph.

  • Every aesthetic mapping has a scale. If you want to adjust how that scale is marked or graduated, then you use a scale_ function.
  • Many scales come with a legend or key to help the reader interpret the graph. These are called guides. You can make adjustments to them with the guides() function. Perhaps the most common use case is to make the legend disappear, as it is sometimes superfluous. Another is to adjust the arrangement of the key in legends and colorbars.
  • Graphs have other features not strictly connected to the logical structure of the data being displayed. These include things like their background color, the typeface used for labels, or the placement of the legend on the graph. To adjust these, use the theme() function.

Consistent with ggplot’s overall approach, adjusting some visible feature of the graph means first thinking about the relationship that the feature has with the underlying data. Roughly speaking, if the change you want to make will affect the substantive interpretation of any particular geom, then most likely you will either be mapping an aesthetic to a variable using that geom’s aes() function, or you will be specifying a change via some scale_ function. If the change you want to make does not affect the interpretation of a given geom_, then most likely you will either be setting a variable inside the geom_ function, or making a cosmetic change via the theme() function.
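The distinction between mapping an aesthetic and setting it can be sketched with a small example. It assumes the built-in mtcars data rather than organdata, and the color "purple" is an arbitrary choice.

```r
library(ggplot2)

p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

# Mapping: color depends on a variable, so it gets a scale and a legend
p_mapped <- p + geom_point(mapping = aes(color = factor(cyl)))

# Setting: color is a constant given outside aes(); no scale, no legend
p_set <- p + geom_point(color = "purple")

p_mapped
p_set
```

The first plot changes what the points mean (color now encodes the number of cylinders). The second only changes how they look.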

p <- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point()

Scales and guides are closely connected, which can make things confusing. The guide provides information about the scale, such as in a legend or colorbar. Thus, it is possible to make adjustments to guides from inside the various scale_ functions. More often it is easier to use the guides() function directly.

A plot with three aesthetic mappings. The variable roads is mapped to x; donors is mapped to y; and world is mapped to color. The x and y scales are both continuous, running smoothly from just under the lowest value of the variable to just over the highest value. Various labeled tick marks orient the reader to the values on each axis. The color mapping also has a scale. The world measure is an unordered categorical variable, so its scale is discrete. It takes one of four values, each represented by a different color.

Along with color, mappings like fill, shape, and size will have scales that we might want to customize or adjust. We could have mapped world to shape instead of color. In that case our four-category variable would have a scale consisting of four different shapes. Scales for these mappings may have labels, axis tick marks at particular positions, or specific colors or shapes. If we want to adjust them, we use one of the scale_ functions.

Many different kinds of variable can be mapped. More often than not x and y are continuous measures. But they might also easily be discrete, as when we mapped country names to the y axis in our boxplots and dotplots. An x or y mapping can also be defined as a transformation onto a log scale, or as a special sort of number value like a date. Similarly, a color or a fill mapping can be discrete and unordered, as with our world variable, or discrete and ordered, as with letter grades in an exam. A color or fill mapping can also be a continuous quantity, represented as a gradient running smoothly from a low to a high value. Finally, both continuous gradients and ordered discrete values might have some defined neutral midpoint with extremes diverging in both directions.
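As a rough sketch of the last case, a continuous color mapping with a neutral midpoint can be drawn with scale_color_gradient2(). The example assumes the built-in mtcars data, and the derived variable mpg_dev and the three colors are illustrative choices only.

```r
library(ggplot2)

# Deviation from mean mileage: a quantity with a natural neutral point at zero
mt <- transform(mtcars, mpg_dev = mpg - mean(mpg))

p_div <- ggplot(data = mt,
                mapping = aes(x = wt, y = hp, color = mpg_dev)) +
    geom_point(size = 2) +
    # Diverging gradient: two hues run away from a neutral midpoint
    scale_color_gradient2(low = "#2E74C0", mid = "gray80", high = "#CB454A",
                          midpoint = 0) +
    labs(color = "MPG vs. mean")

p_div
```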

Because we have several potential mappings, and each mapping might be to one of several different scales, we end up with a lot of individual scale_ functions. Each deals with one combination of mapping and scale. They are named according to a consistent logic: scale_<MAPPING>_<KIND>(). First comes the scale_ name, then the mapping it applies to, and finally the kind of value the scale will display. Thus, the scale_x_continuous() function controls x scales for continuous variables; scale_y_discrete() adjusts y scales for discrete variables; and scale_x_log10() transforms an x mapping to a log scale. Most of the time, ggplot will guess correctly what sort of scale is needed for your mapping. Then it will work out some default features of the scale (such as its labels and where the tick marks go). In many cases you will not need to make any scale adjustments. If x is mapped to a continuous variable then adding + scale_x_continuous() to your plot statement with no further arguments will have no effect. It is already there implicitly. Adding + scale_x_log10(), on the other hand, will transform your scale, as now you have replaced the default treatment of a continuous x variable.

If you want to adjust the labels or tick marks on a scale, you will need to know which mapping it is for and what sort of scale it is. Then you supply the arguments to the appropriate scale function. For example, we can change the x-axis of the previous plot to a log scale, and then also change the position and labels of the tick marks on the y-axis.

p <- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point() +
    scale_x_log10() +
    scale_y_continuous(breaks = c(5, 15, 25),
                       labels = c("Five", "Fifteen", "Twenty Five"))

The same applies to mappings like color and fill. Here the available scale_ functions include ones that deal with continuous, diverging, and discrete variables, as well as others that we will encounter later when we discuss the use of color and color palettes in more detail. When working with a scale that produces a legend, we can also use its scale_ function to specify the labels in the key. To change the title of the legend, however, we use the labs() function, which lets us label all the mappings.

p <- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point() +
    scale_color_discrete(labels =
                             c("Corporatist", "Liberal",
                               "Social Democratic", "Unclassified")) +
    labs(x = "Road Deaths",
         y = "Donor Procurement",
        color = "Welfare State")

If we want to move the legend somewhere else on the plot, we are making a purely cosmetic decision and that is the job of the theme() function. As we have already seen, adding + theme(legend.position = "top") will move the legend as instructed. Finally, to make the legend disappear altogether, we tell ggplot that we do not want a guide for that scale.

We will use scale_ functions fairly regularly to make small adjustments to the labels and axes of our graphs. And we will occasionally use the theme() function to make some cosmetic adjustments.

p <- ggplot(data = organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = world))
p + geom_point() +
    labs(x = "Road Deaths",
         y = "Donor Procurement") +
    guides(color = "none")

3.3 Refine your plots

# Progressive enhancements of the same plot
# v1
p <- ggplot(data = subset(asasec, Year == 2014),
            mapping = aes(x = Members, y = Revenues, label = Sname))

p + geom_point() + geom_smooth()

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

# v2
p <- ggplot(data = subset(asasec, Year == 2014),
            mapping = aes(x = Members, y = Revenues, label = Sname))

p + geom_point(mapping = aes(color = Journal)) +
    geom_smooth(method = "lm")

# v3: 
p0 <- ggplot(data = subset(asasec, Year == 2014),
             mapping = aes(x = Members, y = Revenues, label = Sname))

p1 <- p0 + geom_smooth(method = "lm", se = FALSE, color = "gray80") +
    geom_point(mapping = aes(color = Journal)) 

# v4
p2 <- p1 + geom_text_repel(data=subset(asasec,
                                       Year == 2014 & Revenues > 7000),
                           size = 2)
p2

# v5
p3 <- p2 + labs(x="Membership",
        y="Revenues",
        color = "Section has own Journal",
        title = "ASA Sections",
        subtitle = "2014 Calendar year.",
        caption = "Source: ASA annual report.")
p4 <- p3 + scale_y_continuous(labels = scales::dollar) +
     theme(legend.position = "bottom")
p4

3.3.1 Use color to your advantage

You should choose a color palette in the first place based on its ability to express the data you are plotting. Take care to choose a palette that reflects the structure of your data. Separate from these mapping issues, there are considerations about which colors in particular to choose. In general, the default color palettes that ggplot makes available are well-chosen for their perceptual properties and aesthetic qualities. We can also use color and color layers as a device for emphasis, to highlight particular data points or parts of the plot, perhaps in conjunction with other features.

We choose color palettes for mappings through one of the scale_ functions for color or fill. While it is possible to very finely control the look of your color schemes by varying the hue, chroma, and luminance of each color you use via scale_color_hue(), or scale_fill_hue(), in general this is not recommended. Instead you should use the RColorBrewer package to make a wide range of named color palettes available to you. When used in conjunction with ggplot, you access these colors by specifying the scale_color_brewer() or scale_fill_brewer() functions, depending on the aesthetic you are mapping.

You can also specify colors manually, via scale_color_manual() or scale_fill_manual(). These functions take a values argument that can be specified as a vector of color names or color values that R knows about. The ability to manually specify colors can be useful when the meaning of a category itself has a strong color association. R knows many color names (like red, green, and cornflowerblue). Try demo('colors') for an overview. Alternatively, color values can be specified via their hexadecimal RGB value. This is a way of encoding color values in the RGB colorspace, where each channel can take a value from 0 to 255. A color hex value begins with a hash or pound character, #, followed by three pairs of hexadecimal or “hex” numbers. Hex values are in Base 16, with the first six letters of the alphabet standing for the numbers 10 to 15. This allows a two-character hex number to range from 0 to 255. You read them as #rrggbb, where rr is the two-digit hex code for the red channel, gg for the green channel, and bb for the blue channel. So #CC55DD translates in decimal to CC = 204 (red), 55 = 85 (green), and DD = 221 (blue). It gives a strong pink color.
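The hex arithmetic above can be checked directly in base R. This is a small sketch using only functions from base R and grDevices; the object names are arbitrary.

```r
# Convert a hex color to its decimal red, green, and blue channels
channels <- col2rgb("#CC55DD")
channels

# Each pair of hex digits is simply a base-16 number
pairs <- strtoi(c("CC", "55", "DD"), base = 16L)
pairs

# Going the other way: decimal channels back to a hex string
hex <- rgb(204, 85, 221, maxColorValue = 255)
hex

# Named colors resolve to RGB channels in the same way
col2rgb("cornflowerblue")
```

Either form, a name or a hex string, can be passed in the values vector of scale_color_manual() or scale_fill_manual().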

If we are serious about using a safe palette for color-blind viewers, we should investigate the dichromat package instead (the colorblindr package has similar functionality). It provides a range of safe palettes and some useful functions for helping you approximately see what your current palette might look like to a viewer with one of several different kinds of color blindness.

p <- ggplot(data = organdata,
            mapping = aes(x = roads, y = donors, color = world))
p + geom_point(size = 2) + scale_color_brewer(palette = "Set2") +
    theme(legend.position = "top")

p + geom_point(size = 2) + scale_color_brewer(palette = "Pastel2") +
        theme(legend.position = "top")

p + geom_point(size = 2) + scale_color_brewer(palette = "Dark2") +
    theme(legend.position = "top")

# Defining your own palette
cb_palette <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
                "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

p4 + scale_color_manual(values = cb_palette)

# Setting default color palette
Default <- brewer.pal(5, "Set2")

# safety colors from dichromat 
types <- c("deutan", "protan", "tritan")
names(types) <- c("Deuteranopia", "Protanopia", "Tritanopia")

color_table <- types %>%
    purrr::map(~ dichromat(Default, .x)) %>%
    as_tibble() %>%
    add_column(Default, .before = TRUE)

color_table
Default Deuteranopia Protanopia Tritanopia
#66C2A5 #AEAEA7 #BABAA5 #82BDBD
#FC8D62 #B6B661 #9E9E63 #F29494
#8DA0CB #9C9CCB #9E9ECB #92ABAB
#E78AC3 #ACACC1 #9898C3 #DA9C9C
#A6D854 #CACA5E #D3D355 #B6C8C8
color_comp(color_table)

3.3.2 Layer color and text together

Aside from mapping variables directly, color is also very useful when we want to pick out or highlight some aspect of our data. It is in cases like this that the layered approach of ggplot can really work to our advantage.

We will build up a plot of data about the 2016 US general election. It is contained in the county_data object in the socviz library. We begin by defining a blue and red color for the Democrats and Republicans. Then we create the basic setup and first layer of the plot. We subset the data, including only counties with a value of “No” on the flipped variable. We set the color of geom_point() to be a light gray, as it will form the background layer of the plot. And we apply a log transformation to the x-axis scale.

In the next step we add a second geom_point() layer. Here we start with the same dataset but extract a complementary subset from it. This time we choose the “Yes” counties on the flipped variable. The x and y mappings are the same, but we add a color scale for these points, mapping the partywinner16 variable to the color aesthetic. Then we specify a manual color scale with scale_color_manual(), where the values are the blue and red party_colors we defined above.

The next layer sets the y-axis scale and the labels.

Finally, we add a third layer using the geom_text_repel() function. Once again we supply a set of instructions to subset the data for this text layer. We are interested in the flipped counties that have a relatively high percentage of African-American residents.

# Democrat Blue and Republican Red
party_colors <- c("#2E74C0", "#CB454A")

p0 <- ggplot(data = subset(county_data,
                           flipped == "No"),
             mapping = aes(x = pop,
                           y = black/100))

p1 <- p0 + geom_point(alpha = 0.15, color = "gray50") +
    scale_x_log10(labels=scales::comma) 

p1

p2 <- p1 + geom_point(data = subset(county_data,
                                    flipped == "Yes"),
                      mapping = aes(x = pop, y = black/100,
                                    color = partywinner16)) +
    scale_color_manual(values = party_colors)

p2

p3 <- p2 + scale_y_continuous(labels=scales::percent) +
    labs(color = "County flipped to ... ",
         x = "County Population (log scale)",
         y = "Percent Black Population",
         title = "Flipped counties, 2016",
         caption = "Counties in gray did not flip.")

p3

p4 <- p3 + geom_text_repel(data = subset(county_data,
                                      flipped == "Yes" &
                                      black  > 25),
                           mapping = aes(x = pop,
                                   y = black/100,
                                   label = state), size = 2)

p4 + theme_minimal() +
    theme(legend.position="top")

3.3.3 Change the appearance of plots with themes

If we want to change the overall look of a plot all at once, we can do that using ggplot’s theme engine. Themes can be turned on or off using the theme_set() function. It takes a theme as an argument, which is itself produced by calling a theme function such as theme_bw().

Internally, theme functions are a set of detailed instructions to turn on, turn off, or modify a large number of graphical elements on the plot. Once set, a theme applies to all subsequent plots and it remains active until it is replaced by a different theme. This can be done either through the use of another theme_set() statement, or on a per-plot basis by adding the theme function to the end of the plot: p4 + theme_gray() would temporarily override the generally active theme for the p4 object only. You can still use the theme() function to fine-tune any aspect of your plot, as seen above with the relocation of the legend to the top of the graph.

The ggplot library comes with several built-in themes, including theme_minimal() and theme_classic(), with theme_gray() or theme_grey() as the default. If these are not to your taste, install the ggthemes library for many more options.

You can define your own themes either entirely from scratch, or by starting with one you like and making adjustments from there.
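A minimal sketch of the second route, starting from an existing theme and adjusting it, might look like this. The function name theme_report() and the particular adjustments are hypothetical, and mtcars is used only so that the example runs on its own.

```r
library(ggplot2)

# Hypothetical custom theme: start from theme_minimal() and adjust a few elements
theme_report <- function(base_size = 11) {
    theme_minimal(base_size = base_size) +
        theme(legend.position = "top",
              plot.title = element_text(face = "bold"),
              panel.grid.minor = element_blank())
}

# theme_set() returns the previously active theme, so it can be restored later
old_theme <- theme_set(theme_report())

p_demo <- ggplot(data = mtcars,
                 mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
    geom_point() +
    labs(title = "Custom theme demo", color = "Cylinders")

p_demo                 # drawn with theme_report()
p_demo + theme_gray()  # per-plot override; the active theme is unchanged

# Put the previous theme back
theme_set(old_theme)
```

Keeping the old theme in an object is a convenience for interactive work, so that experimenting with a theme does not change every later plot in the session.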

Wilke’s cowplot package, for instance, contains a well-developed theme suitable for figures whose final destination is a journal article. Bob Rudis’s hrbrthemes package, meanwhile, has a distinctive and compact look and feel that takes advantage of some freely-available typefaces.

The theme() function allows you to exert very fine-grained control over the appearance of all kinds of text and graphical elements in a plot.

# theme_set(theme_bw())
# p4 + theme(legend.position="top")
# 
# theme_set(theme_dark())
# p4 + theme(legend.position="top")
# 
# theme_set(theme_economist())
# p4 + theme(legend.position="top")

# theme_set(theme_wsj())

p4 + theme(plot.title = element_text(size = rel(0.6)),
           legend.title = element_text(size = rel(0.35)),
           plot.caption = element_text(size = rel(0.35)),
           legend.position = "top")

p4 + theme(legend.position = "top")

p4 + theme(legend.position = "top",
           plot.title = element_text(size=rel(2),
                                     lineheight=.5,
                                     family="Times",
                                     face="bold.italic",
                                     colour="orange"),
           axis.text.x = element_text(size=rel(1.1),
                                      family="Courier",
                                      face="bold",
                                      color="purple"))

### Use Theme Elements in a Substantive Way

The gss_lon data contains information on the age of each GSS respondent for all the years in the survey since 1972. We will fill the density curves with a dark grey color, and then add an indicator of the mean age in each year, and a text layer for the label. With those in place we then adjust the detail of several theme elements, mostly to remove them. As before we use element_text() to tweak the appearance of various text elements such as titles and labels. And we also use element_blank() to remove several of them altogether. First, we need to calculate the mean age of the respondents for each year of interest. Because the GSS has been around for most (but not all) years since 1972, we will look at distributions about every four years since the beginning.

The initial p object subsets the data by the years we have chosen, and maps x to the age variable. The geom_density() call is the base layer, with arguments to turn off its default line color, set the fill to a shade of gray, and scale the y-axis between zero and one. Using our summarized dataset, the geom_vline() layer draws a vertical white line at the mean age of the distribution.

The ggridges package offers a different take on small-multiple density plots by allowing the distributions to overlap vertically to interesting effect. It is especially useful for repeated distributional measures that change in a clear direction. The expand argument in scale_y_discrete() adjusts the scaling of the y-axis slightly. The package also comes with its own theme, theme_ridges() that adjusts the labels so that they are aligned properly. The degree of overlap in the distributions is controlled by the scale argument in the geom.

Setting these thematic elements in an ad hoc way is often one of the first things people want to do when they make a plot. But making small adjustments to theme elements should be the very last thing you do in the plotting process. Ideally, once you have set up a theme that works well for you, it should be something you can avoid having to do at all.

yrs <- c(seq(1972, 1988, 4), 1993, seq(1996, 2016, 4))
yrs
##  [1] 1972 1976 1980 1984 1988 1993 1996 2000 2004 2008 2012 2016
mean_age <- gss_lon %>%
    # Use the vectorized & here: && only looks at the first element of each
    # condition, so it would not filter the rows at all
    filter(age %nin% NA & year %in% yrs) %>%
    group_by(year) %>%
    summarize(xbar = round(mean(age, na.rm = TRUE), 0))
mean_age
year xbar
1972 45
1976 45
1980 45
1984 44
1988 45
1993 46
1996 45
2000 46
2004 46
2008 48
2012 48
2016 49
mean_age$y <- 0.3

yr_labs <- data.frame(x = 85, y = 0.8,
                      year = yrs)

# First, we create the plot structure
p <- ggplot(data = subset(gss_lon, year %in% yrs),
            mapping = aes(x = age))

p1 <- p + geom_density(fill = "gray20", color = FALSE,
                       alpha = 0.9, mapping = aes(y = ..scaled..)) +
    geom_vline(data = subset(mean_age, year %in% yrs),
               aes(xintercept = xbar), color = "white", size = 0.5) +
    geom_text(data = subset(mean_age, year %in% yrs),
              aes(x = xbar, y = y, label = xbar), nudge_x = 7.5,
              color = "white", size = 3.5, hjust = 1) +
    geom_text(data = subset(yr_labs, year %in% yrs),
              aes(x = x, y = y, label = year)) +
    facet_grid(year ~ ., switch = "y")

# With the structure of the plot in place, we then style the elements in the way that we want, using a series of instructions to theme().
# p1 + theme_book(base_size = 10, plot_title_size = 10,
#                 strip_text_size = 32, panel_spacing = unit(0.1, "lines")) +
#     theme(plot.title = element_text(size = 16),
#           axis.text.x= element_text(size = 12),
#           axis.title.y=element_blank(),
#           axis.text.y=element_blank(),
#           axis.ticks.y = element_blank(),
#           strip.background = element_blank(),
#           strip.text.y = element_blank(),
#           panel.grid.major = element_blank(),
#           panel.grid.minor = element_blank()) +
#     labs(x = "Age",
#          y = NULL,
#          title = "Age Distribution of\nGSS Respondents")

# Using the ggridges package
p <- ggplot(data = gss_lon,
            mapping = aes(x = age, y = factor(year, levels = rev(unique(year)),
                                     ordered = TRUE)))

p + geom_density_ridges(alpha = 0.6, fill = "lightblue", scale = 1.5) +
    scale_x_continuous(breaks = c(25, 50, 75)) +
    scale_y_discrete(expand = c(0.01, 0)) + 
    labs(x = "Age", y = NULL,
         title = "Age Distribution of\nGSS Respondents") +
    theme_ridges() +
    theme(title = element_text(size = 16, face = "bold"))

### Case Studies

#### Two y-axes

R makes it slightly tricky to draw graphs with two y-axes. In fact, ggplot rules it out of order altogether. It is possible to do it using R’s base graphics. Most of the time when people draw plots with two y-axes they want to line the series up as closely as possible because they suspect that there’s a substantive association between them. The main problem with using two y-axes is that it makes it even easier than usual to fool yourself (or someone else) about the degree of association between the variables. This is because you can adjust the scaling of the axes relative to one another in a way that moves the data series around more or less however you like.

We could use a split- or broken-axis plot to show the two series at the same time. These can be effective sometimes, and they seem to have better perceptual properties than overlaid charts with dual axes. Another compromise, if the series are not in the same units (or of widely differing magnitudes), is to rescale one of the series (e.g., by dividing or multiplying it by a thousand), or alternatively to index each of them to 100 at the start of the first period, and then plot them both. Index numbers can have complications of their own, but here they allow us to use one axis instead of two, and also to calculate a sensible difference between the two series and plot that as well.
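Indexing a series to 100 at its first observation is a one-line calculation. The sketch below copies the first four sp500 and monbase values from the fredts rows printed in this section; the helper name index_100() is hypothetical.

```r
# First four observations of each series, as printed for fredts in this section
sp500   <- c(696.68, 766.73, 799.10, 809.06)
monbase <- c(1542228, 1693133, 1693133, 1733017)

# Hypothetical helper: express each value relative to the first one, times 100
index_100 <- function(x) 100 * x / x[1]

sp500_i   <- index_100(sp500)
monbase_i <- index_100(monbase)

# Both series now start at 100 and share one scale, so they can be
# plotted on a single axis and subtracted from each other
round(sp500_i, 4)
round(monbase_i, 4)
round(sp500_i - monbase_i, 2)
```

The results should agree with the sp500_i and monbase_i columns already present in fredts, which is why the plotting code below can use those columns directly.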

Now we have our two plots, we want to lay them out nicely. We do not want them to appear in the same plot area, but we do want to compare them. It would be possible to do this with a facet, but that would mean doing a fair amount of data munging to get all three series (the two indices and the difference between them) into the same tidy data frame. An alternative is to make two separate plots and then arrange them just as we like. The cowplot library makes things easy. It has a plot_grid() function that works much like grid.arrange() while also taking care of some fine details, including the proper alignment of axes across separate plot objects.

The broader problem with dual-axis plots of this sort is that the apparent association between these variables is probably spurious. The original plot is enabling our desire to spot patterns, but substantively it is probably the case that both of these time series are tending to increase, but are not otherwise related in any deep way. The use of dual axes is not recommended in general because it is already much too easy to present spurious, or at least overconfident, associations, especially with time series data. Scatterplots can do that just fine. Even with a single series, we can make associations look steeper or flatter by fiddling with the aspect ratio. Using two y-axes gives you an extra degree of freedom to mess about with the data.

# Tidying data
head(fredts)
date sp500 monbase sp500_i monbase_i
2009-03-11 696.68 1542228 100.0000 100.0000
2009-03-18 766.73 1693133 110.0548 109.7849
2009-03-25 799.10 1693133 114.7012 109.7849
2009-04-01 809.06 1733017 116.1308 112.3710
2009-04-08 830.61 1733017 119.2240 112.3710
2009-04-15 852.21 1789878 122.3245 116.0579
fredts_m <- fredts %>% select(date, sp500_i, monbase_i) %>%
    gather(key = series, value = score, sp500_i:monbase_i)

head(fredts_m)
date series score
2009-03-11 sp500_i 100.0000
2009-03-18 sp500_i 110.0548
2009-03-25 sp500_i 114.7012
2009-04-01 sp500_i 116.1308
2009-04-08 sp500_i 119.2240
2009-04-15 sp500_i 122.3245
# Plotting
p <- ggplot(data = fredts_m,
            mapping = aes(x = date, y = score,
                          group = series,
                          color = series))
p1 <- p + geom_line() + theme(legend.position = "top") +
    labs(x = "Date",
         y = "Index",
         color = "Series")

p <- ggplot(data = fredts,
            mapping = aes(x = date, y = sp500_i - monbase_i))

p2 <- p + geom_line() +
    labs(x = "Date",
         y = "Difference")
cowplot::plot_grid(p1, p2, nrow = 2, rel_heights = c(0.75, 0.25), align = "v") # arrange the plots 

#### Redrawing a bad slide

To redraw the chart I took the numbers from the bars on the chart together with employee data from QZ.com. Where there was quarterly data in the slide, I used the end-of-year number for employees, except for 2012. Mayer was appointed in July of 2012. Ideally we would have quarterly revenue and quarterly employee data for all years, but given that we do not, the most sensible thing to do is to keep things annualized except for the one year of interest, when Mayer arrives as CEO. It’s worth doing this because otherwise the large round of layoffs that immediately preceded her arrival would be misattributed to her tenure as CEO. The redrawing is straightforward. We could just draw a scatterplot and color the points by whether Mayer was CEO at the time. We can take a small step further by making a scatterplot but also holding on to the temporal element. We can use geom_path() and use line segments to “join the dots” of the yearly observations in order, labeling each point with its year.

Alternatively, we can keep the analyst community happy by putting time back on the x-axis and plotting the ratio of revenue to employees on the y-axis.

headTail(yahoo)
Year Revenue Employees Mayer
2004 3574 7600 No
2005 5257 9800 No
2006 6425 11400 No
2007 6969 14300 No
NA
2012 4986 12000 No
2012 4986 11500 Yes
2013 4680 12200 Yes
2014 4618 12500 Yes
p <- ggplot(data = yahoo,
            mapping = aes(x = Employees, y = Revenue))
p + geom_path(color = "gray80") +
    geom_text(aes(color = Mayer, label = Year), # highlight points of interest 
              size = 3, fontface = "bold") +
    theme(legend.position = "bottom") +
    labs(color = "Mayer is CEO",
         x = "Employees", y = "Revenue (Millions)",
         title = "Yahoo Employees vs Revenues, 2004-2014") +
    scale_y_continuous(labels = scales::dollar) +
    scale_x_continuous(labels = scales::comma)

# Alternative version
p <- ggplot(data = yahoo,
            mapping = aes(x = Year, y = Revenue/Employees))

p + geom_vline(xintercept = 2012) +
    geom_line(color = "gray60", size = 2) +
    annotate("text", x = 2013, y = 0.44,
             label = " Mayer becomes CEO", size = 2.5) +
    labs(x = "Year\n",
         y = "Revenue/Employees",
         title = "Yahoo Revenue to Employee Ratio, 2004-2014")

#### Saying no to pie

There is a reasonable amount of customization in this graph. First, the text of the facets is made bold in the theme() call. The graphical element is first named (strip.text.x) and then modified using the element_text() function. We also use a custom palette for the fill mapping, via scale_fill_brewer(). And finally we relabel the facets to something more informative than their bare variable names. This is done using the labeller argument and the as_labeller() function inside the facet_grid() call. At the beginning of the plotting code, we set up an object called f_labs, which is in effect a tiny lookup table (a named character vector) that associates new labels with the values of the type variable in studebt. We use backticks (the angled quote character located next to the ‘1’ key on US keyboards) to pick out the values we want to relabel. The as_labeller() function takes this object and uses it to create new text for the labels when facet_grid() is called.
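The relabelling mechanism can be tried in isolation. This is a minimal sketch using the built-in mtcars data and made-up labels (cyl_labs and p_facet are hypothetical names, not part of the original analysis):

```r
library(ggplot2)

# named character vector: the names are the existing facet values,
# the elements are the new labels; backticks are needed here because
# 4, 6 and 8 are not syntactic R names
cyl_labs <- c(`4` = "Four\ncylinders",
              `6` = "Six\ncylinders",
              `8` = "Eight\ncylinders")

p_facet <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_grid(~ cyl, labeller = as_labeller(cyl_labs)) +   # relabel the facet strips
  theme(strip.text.x = element_text(face = "bold"))       # bold strip text

p_facet
```

For values that are already valid R names, such as Borrowers and Balances, the backticks are optional.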

When the categorical axis labels are long, though, I generally find it’s easier to read them on the y-axis. The colors on the graph are not encoding or mapping any information in the data that is not already taken care of by the faceting. The fill mapping is useful, but also redundant. This graph could easily be in black and white, and would be just as informative if it were.

One thing that is not emphasized in a faceted chart like this is the idea that each of the debt categories is a share or percentage of a total amount.

Instead of having separate bars distinguished by heights, we can array the percentages for each distribution proportionally within a single bar. We will make a stacked bar chart. We are careful to map the debt categories in an ascending sequence of colors, and to adjust the key so that the values run from low to high, from left to right, and from yellow to purple. This is done partly by switching the fill mapping from Debt to Debtrc. The categories of the latter are the same as the former, but the sequence of debt levels is coded in the order we want.

The rest of the work is done in the guides() call. We give guides() a series of instructions about the fill mapping: reverse the direction of the color coding; put the legend title above the key; put the labels for the colors below the key; widen the color boxes a little; and place the whole key on a single row.

head(studebt)
Debt type pct Debtrc
Under $5 Borrowers 20 Under $5
$5-$10 Borrowers 17 $5-$10
$10-$25 Borrowers 28 $10-$25
$25-$50 Borrowers 19 $25-$50
$50-$75 Borrowers 8 $50-$75
$75-$100 Borrowers 3 $75-$100
# setting up some labels in advance, as we will reuse them
p_xlab <- "Amount Owed, in thousands of Dollars"
p_title <- "Outstanding Student Loans"
p_subtitle <- "44 million borrowers owe a total of $1.3 trillion"
p_caption <- "Source: FRB NY"

# a special label for the facets
f_labs <- c(`Borrowers` = "Percent of\nall Borrowers",
            `Balances` = "Percent of\nall Balances")

p <- ggplot(data = studebt,
            mapping = aes(x = Debt, y = pct/100, fill = type))
p + geom_bar(stat = "identity") +
    scale_fill_brewer(type = "qual", palette = "Dark2") +
    scale_y_continuous(labels = scales::percent) +
    guides(fill = "none") + # "none" drops the fill legend; FALSE is deprecated in newer ggplot2 releases
    theme(strip.text.x = element_text(face = "bold")) +
    labs(y = NULL, x = p_xlab,
      caption = p_caption,
      title = p_title,
      subtitle = p_subtitle) +
    facet_grid(~ type, labeller = as_labeller(f_labs)) +
    coord_flip()

# stacked bar chart
p <- ggplot(studebt, aes(y = pct/100, x = type, fill = Debtrc)) # pct/100 to plot as pct 
p + geom_bar(stat = "identity", color = "gray80") + # we set the border colors of the bars to a light gray in geom_bar() to make the bar segments easier to distinguish. 
  scale_x_discrete(labels = as_labeller(f_labs)) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_viridis(discrete = TRUE) + # using scale_fill_viridis() for the color palette 
  guides(fill = guide_legend(reverse = TRUE,
                             title.position = "top",
                             label.position = "bottom",
                             keywidth = 3,
                             nrow = 1)) +
  labs(x = NULL, y = NULL,
       fill = "Amount Owed, in thousands of dollars",
       caption = p_caption,
       title = p_title,
       subtitle = p_subtitle) +
  theme(legend.position = "top",
        axis.text.y = element_text(face = "bold", hjust = 1, size = 12),
        axis.ticks.length = unit(0, "cm"),
        panel.grid.major.y = element_blank()) +
  coord_flip()

4 Fundamentals of Data Visualization (Claus O. Wilke) + SDS 375

4.1 Aesthetic mappings

# data preparation
temperatures <- read_csv("input/tempnormals.csv")

# mapping aesthetics to data
ggplot(data = temperatures, aes(x = day_of_year, y = temperature, color = location)) +
  geom_line()

ggplot(temperatures, aes(x = day_of_year, y = location, color = temperature)) +
  geom_point(size = 5)

ggplot(temperatures, aes(month, temperature, color = location)) +
  geom_boxplot()

ggplot(temperatures, aes(month, temperature, fill = location)) +
  geom_violin() +
  facet_wrap(~ location)

# Color and fill apply to different things
# Many geoms have both color and fill aesthetics
ggplot(temperatures, aes(month, temperature, color = location)) + 
  geom_boxplot()

ggplot(temperatures, aes(month, temperature, fill = location)) + 
  geom_boxplot()

# Aesthetics can also be used as parameters in geoms
ggplot(temperatures, aes(month, temperature, fill = location)) + 
  geom_boxplot(color = "steelblue")

4.2 Visualizing amounts

boxoffice <- tibble(
  rank = 1:5,
  title = c("Star Wars", "Jumanji", "Pitch Perfect 3", "Greatest Showman", "Ferdinand"),
  amount = c(71.57, 36.17, 19.93, 8.81, 7.32) # million USD
)

ggplot(boxoffice, aes(title, amount)) +
  geom_col()

# Order by data value
ggplot(boxoffice, aes(fct_reorder(title, amount), amount)) +
  geom_col()

# Order by data value, descending
ggplot(boxoffice, aes(fct_reorder(title, -amount), amount)) +
  geom_col() + 
  xlab(NULL) # remove x axis label

# Flip x and y, set custom x axis label
ggplot(boxoffice, aes(amount, fct_reorder(title, amount))) +
  geom_col() +
  xlab("amount (in million USD)") +
  ylab(NULL)

# Use geom_bar() to count before plotting
ggplot(penguins, aes(y = species)) + # note: no x aesthetic defined
  geom_bar()

# Getting the bars into the right order
ggplot(penguins, aes(y = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie"))) + # Manually, using fct_relevel()
  geom_bar() +
  ylab(NULL)

ggplot(penguins, aes(y = fct_reorder(species, species, length))) + # Using fct_reorder + length
  geom_bar() +
  ylab(NULL)

# Display counts by species and sex
ggplot(penguins, aes(sex, fill = species)) +
  geom_bar()

penguins_nomissing <- na.omit(penguins) # remove all rows with any missing values
ggplot(penguins_nomissing, aes(sex, fill = species)) +
  geom_bar()

# Positions define how subgroups are shown
ggplot(penguins_nomissing, aes(sex, fill = species)) +
  geom_bar(position = "dodge") # position = "dodge": Place bars for subgroups side-by-side 

ggplot(penguins_nomissing, aes(sex, fill = species)) +
  geom_bar(position = "stack") # position = "stack": Place bars for subgroups on top of each other 

ggplot(penguins_nomissing, aes(sex, fill = species)) +
  geom_bar(position = "fill") # position = "fill": Like "stack", but scale to 100% 

4.3 Visualizing distributions

# import data
titanic <- read_csv("input/titanic.csv")
lincoln_temps <- lincoln_weather %>%
  mutate(
    date = ymd(CST),
    month_long = Month,
    month = fct_recode(
      Month,
      Jan = "January",
      Feb = "February",
      Mar = "March",
      Apr = "April",
      May = "May",
      Jun = "June",
      Jul = "July",
      Aug = "August",
      Sep = "September",
      Oct = "October",
      Nov = "November",
      Dec = "December"
    ),
    mean_temp = `Mean Temperature [F]`
  ) %>%
  select(date, month, month_long, mean_temp) %>%
  mutate(month = fct_rev(month)) # fct_recode() places levels in reverse order

# Making histograms and setting the bin width
ggplot(titanic, aes(age)) +
  geom_histogram(binwidth = 5)

# Always set the center as well
ggplot(titanic, aes(age)) +
  geom_histogram(
    binwidth = 5,  # width of the bins
    center = 2.5   # one bin is centered on 2.5, so bins run 0-5, 5-10, ...
  )
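To see what center changes, the computed bin edges can be inspected with layer_data(). This is a small sketch on a hypothetical vector of four ages (toy_ages is made up, not taken from the titanic data):

```r
library(ggplot2)
library(tibble)

toy_ages <- tibble(age = c(1, 4, 6, 12))  # hypothetical ages

p_centered <- ggplot(toy_ages, aes(age)) +
  geom_histogram(binwidth = 5, center = 2.5)

p_default <- ggplot(toy_ages, aes(age)) +
  geom_histogram(binwidth = 5)

# layer_data() returns the values computed by the stat;
# xmin and xmax are the left and right edges of each bin
bins_centered <- layer_data(p_centered)[, c("xmin", "xmax", "count")]
bins_default  <- layer_data(p_default)[, c("xmin", "xmax", "count")]

bins_centered
bins_default
```

Comparing the two tables shows that the same data can fall into different bins depending on where the bin edges sit, which is why setting center (or boundary) explicitly makes a histogram easier to interpret.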

# Making density plots
ggplot(titanic, aes(age)) +
  geom_density(fill = "skyblue")

# Modifying bandwidth (bw) and kernel parameters
ggplot(titanic, aes(age)) +
  geom_density(
    fill = "skyblue",
    bw = 0.5,               # a small bandwidth
    kernel = "gaussian"     # Gaussian kernel (the default)
  )

ggplot(titanic, aes(age)) +
  geom_density(
    fill = "skyblue",
    bw = 2,                 # a moderate bandwidth
    kernel = "rectangular"  # rectangular kernel
  )

# Statistical transformations (stats) can be set explicitly
ggplot(titanic, aes(age)) +
  geom_density(
    stat = "density",    # the default for geom_density()
    fill = "skyblue"
  )

ggplot(titanic, aes(age)) +
  geom_area(  # geom_area() does not normally use stat = "density"
    stat = "density",
    fill = "skyblue"
  )

ggplot(titanic, aes(age)) +
  geom_line(  # neither does geom_line()
    stat = "density"
  )

ggplot(titanic, aes(age)) +
  # we can use multiple geoms on top of each other
  geom_area(stat = "density", fill = "skyblue") +
  geom_line(stat = "density")

# Parameters are handed through to the stat
ggplot(titanic, aes(age)) +
  geom_line(stat = "density", bw = 3) # bw is a parameter of stat_density(), not of geom_line() 

ggplot(titanic, aes(age)) +
  geom_line(stat = "density", bw = 0.3)

# We can explicitly map results from stat computations
ggplot(titanic, aes(age)) +
  geom_tile( # geom_tile() draws rectangular colored areas
    aes(
      y = 1, # draw all tiles at the same y location
      fill = after_stat(density)  # use computed density for fill
    ),
    stat = "density",
    n = 20    # number of points calculated by stat_density() 
  )

ggplot(titanic, aes(age)) +
  geom_tile( # geom_tile() draws rectangular colored areas
    aes(
      y = 1, # draw all tiles at the same y location
      fill = after_stat(density)  # use computed density for fill
    ),
    stat = "density",
    n = 200   # number of points calculated by stat_density() 
  )

# Boxplot
ggplot(lincoln_temps, aes(x = month, y = mean_temp)) +
  geom_boxplot(fill = "skyblue")

# Violin plot
ggplot(lincoln_temps, aes(x = month, y = mean_temp)) +
  geom_violin(fill = "skyblue")

# Strip chart
ggplot(lincoln_temps, aes(x = month, y = mean_temp)) +
  geom_point(size = 0.75)  # reduce point size to minimize overplotting

ggplot(lincoln_temps, aes(x = month, y = mean_temp)) +
  geom_point(size = 0.75,  # reduce point size to minimize overplotting 
    position = position_jitter(
      width = 0.15,  # amount of jitter in horizontal direction
      height = 0     # amount of jitter in vertical direction (0 = none)
    )
  )

# Sina plot
ggplot(lincoln_temps, aes(x = month, y = mean_temp)) +
  geom_violin(fill = "skyblue", color = NA) + # violins in background
  geom_sina(size = 0.75) # sina jittered points in foreground

# Ridgeline plot
ggplot(lincoln_temps, aes(x = mean_temp, y = month_long)) +
  geom_density_ridges()

4.4 Coordinate systems and axes

# import data
US_census <- read_csv("https://wilkelab.org/SDS375/datasets/US_census.csv")
tx_counties <- US_census %>% 
  filter(state == "Texas") %>%
  select(name, pop2010) %>%
  extract(name, "county", regex = "(.+) County") %>%
  mutate(popratio = pop2010/median(pop2010)) %>%
  arrange(desc(popratio)) %>%
  mutate(index = 1:n())
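The extract() step above relies on a regular expression with one capture group. Here is a minimal sketch of that step alone, on a hypothetical two-row tibble that mimics the name column of US_census:

```r
library(tibble)
library(tidyr)

# hypothetical input mimicking the `name` column of US_census
toy_names <- tibble(name = c("Harris County", "Travis County"))

# the capture group (.+) keeps everything before " County";
# the original `name` column is replaced by the new `county` column
toy_counties <- extract(toy_names, name, "county", regex = "(.+) County")
toy_counties
```

Rows whose name does not match the pattern would get NA in the new column.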

# The parameter name sets the axis title
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)" # We could do the same with xlab() and ylab()   
  ) +
  scale_y_discrete(
    name = NULL  # no axis title
  )

# The parameter limits sets the scale limits
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80) # We could do the same with xlim() and ylim() 
  ) +
  scale_y_discrete(
    name = NULL
  )

# The parameter breaks sets the axis tick positions
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75)
  ) +
  scale_y_discrete(
    name = NULL
  )

# The parameter labels sets the axis tick labels
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
    labels = c("0", "$25M", "$50M", "$75M")
  ) +
  scale_y_discrete(
    name = NULL
  )

# The parameter expand sets the axis expansion
ggplot(boxoffice) +
  aes(amount, fct_reorder(title, amount)) +
  geom_col() +
  scale_x_continuous(
    name = "weekend gross (million USD)",
    limits = c(0, 80),
    breaks = c(0, 25, 50, 75),
    labels = c("0", "$25M", "$50M", "$75M"),
    expand = expansion(mult = c(0, 0.06))
  ) +
  scale_y_discrete(
    name = NULL
  )
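As a rough check of what expand = expansion(mult = c(0, 0.06)) does: for a continuous scale the multiplicative expansion is applied to the width of the scale limits. The sketch below reproduces that arithmetic in base R; it is an illustration, not ggplot2's internal code.

```r
lims <- c(0, 80)      # limits set in scale_x_continuous()
mult <- c(0, 0.06)    # lower and upper multiplicative expansion

# the lower end is extended by 0 * 80 = 0, the upper end by 0.06 * 80 = 4.8
expanded <- lims + c(-1, 1) * mult * diff(lims)
expanded
```

The bars therefore start exactly at the axis line on the left, while a small gap remains on the right.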

# Linear y scale
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_continuous(
    name = "population number / median",
    breaks = c(0, 100, 200),
    labels = c("0", "100", "200")
  )

# Log y scale
ggplot(tx_counties) +
  aes(x = index, y = popratio) +
  geom_point() +
  scale_y_log10(
    name = "population number / median",
    breaks = c(0.01, 1, 100),
    labels = c("0.01", "1", "100")
  )

# Coords define the coordinate system
ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_cartesian()  # cartesian coords are the default

ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_polar()   # polar coords

ggplot(temperatures, aes(day_of_year, temperature, color = location)) +
  geom_line() +
  coord_polar() + 
  scale_y_continuous(limits = c(0, 105))  # fix up temperature limits

4.5 Color scales

# Data input
temperatures <- read_csv("https://wilkelab.org/SDS375/datasets/tempnormals.csv") %>%
  mutate(
    location = factor(
      location, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
    )
  ) %>%
  select(location, day_of_year, month, temperature)

temps_months <- read_csv("https://wilkelab.org/SDS375/datasets/tempnormals.csv") %>%
  group_by(location, month_name) %>%
  summarize(mean = mean(temperature)) %>%
  mutate(
    month = factor(
      month_name,
      levels = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
    ),
    location = factor(
      location, levels = c("Death Valley", "Houston", "San Diego", "Chicago")
    )
  ) %>%
  select(-month_name)


US_regions <- read_csv("input/US_regions.csv")
popgrowth <- left_join(US_census, US_regions) %>%
    group_by(region, division, state) %>%
    summarize(pop2000 = sum(pop2000, na.rm = TRUE),
              pop2010 = sum(pop2010, na.rm = TRUE),
              popgrowth = (pop2010-pop2000)/pop2000,
              area = sum(area)) %>%
    arrange(popgrowth) %>%
    ungroup() %>%
    mutate(state = factor(state, levels = state),
           region = factor(region, levels = c("West", "South", "Midwest", "Northeast")))

# default
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic()

  # no fill scale defined, default is scale_fill_gradient()

# scale_fill_gradient()
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  scale_fill_gradient()

# scale_fill_viridis_c()
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  scale_fill_viridis_c()

# scale_fill_viridis_c(option = "B")
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  scale_fill_viridis_c(option = "B", begin = 0.15)

# scale_fill_distiller(palette = "YlGnBu")
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  scale_fill_distiller(palette = "YlGnBu")

# using package colorspace
ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  colorspace::scale_fill_continuous_sequential(palette = "YlGnBu", rev = FALSE)

ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  colorspace::scale_fill_continuous_sequential(palette = "Viridis", rev = FALSE)

ggplot(temps_months, aes(x = month, y = location, fill = mean)) + 
  geom_tile(width = 0.95, height = 0.95) + 
  coord_fixed(expand = FALSE) +
  theme_classic() +
  colorspace::scale_fill_continuous_sequential(palette = "Inferno", begin = 0.15, rev = FALSE)

colorspace::hcl_palettes(type = "sequential", plot = TRUE) # all sequential palettes

colorspace::hcl_palettes(type = "diverging", plot = TRUE, n = 9) # all diverging palettes

colorspace::divergingx_palettes(plot = TRUE, n = 9) # all divergingx palettes

# Discrete, qualitative scales are best set manually
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point() +
  scale_x_log10()

  # no color scale defined, default is scale_color_hue()

ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point() +
  scale_x_log10() +
  scale_color_hue()

# library(ggthemes)  # for scale_color_colorblind()
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point() +
  scale_x_log10() +
  scale_color_colorblind()  # uses Okabe-Ito colors

# manually
ggplot(popgrowth, aes(x = pop2000, y = popgrowth, color = region)) +
  geom_point() +
  scale_x_log10() +
  scale_color_manual(
    values = c(West = "#E69F00", South = "#56B4E9", Midwest = "#009E73", Northeast = "#F0E442")
  )

4.6 Figure design

# starting figure
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges()

# geoms (via arguments to geoms)
# Set scale and bandwidth to shape ridgelines
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4
  )

# Set rel_min_height to cut ridgelines near zero
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01
  )

# scales (via scale_*() functions)
# Use scale_*() functions to specify axis labels
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)"
  ) +
  scale_y_discrete(
    name = NULL  # NULL means no label
  )

# Specify scale expansion
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0)
  ) +
  scale_y_discrete(
    name = NULL,
    expand = expansion(add = c(0.2, 2.6))
  )

# plot appearance (via themes)
# Set overall plot theme
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0)
  ) +
  scale_y_discrete(
    name = NULL,
    expand = expansion(add = c(0.2, 2.6))
  ) +
  theme_minimal_grid()  # from cowplot

# Align y axis labels to grid lines
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0)
  ) +
  scale_y_discrete(
    name = NULL,
    expand = expansion(add = c(0.2, 2.6))
  ) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(vjust = 0)
  )

# Change fill color from default gray to blue
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01,
    fill = "#7DCCFF"
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0)
  ) +
  scale_y_discrete(
    name = NULL,
    expand = expansion(add = c(0.2, 2.6))
  ) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(vjust = 0)
  )

# Draw lines in white instead of black
ggplot(lincoln_temps) +
  aes(x = mean_temp, y = month_long) +
  geom_density_ridges(
    scale = 3, bandwidth = 3.4,
    rel_min_height = 0.01,
    fill = "#7DCCFF",
    color = "white"
  ) +
  scale_x_continuous(
    name = "mean temperature (°F)",
    expand = c(0, 0)
  ) +
  scale_y_discrete(
    name = NULL,
    expand = expansion(add = c(0.2, 2.6))
  ) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(vjust = 0)
  )

# Using ready-made themes
ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point()

  # default theme is theme_gray()

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_gray(14) # most themes take a font-size argument to scale text size

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_minimal(14)

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_classic(14)

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_half_open()  # from package cowplot

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_minimal_hgrid()  # from package cowplot

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_economist(14) + scale_color_economist() # from package ggthemes

ggplot(penguins, aes(flipper_length_mm, body_mass_g, color = species)) +
  geom_point() +
  theme_fivethirtyeight(14) + scale_color_fivethirtyeight() # from package ggthemes

# Customizing theme elements
ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    # change color of only the x axis title
    axis.title.x = element_text(
      color = "royalblue2"
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    # change all text colors?
    # why does it not work?
    text = element_text(color = "royalblue2")
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    text = element_text(color = "royalblue2"),
    axis.text = element_text( # The element axis.text has its own color set in the theme. Therefore it doesn't inherit from text 
      color = "royalblue2"
    )
  )
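The inheritance behind the "why does it not work?" question can be inspected with calc_element(), which resolves a theme element against its parents. This sketch uses ggplot2's own theme_gray() instead of cowplot's theme_minimal_grid(), on the assumption that the same mechanism applies to both themes:

```r
library(ggplot2)

# theme_gray() sets an explicit colour on axis.text
axis_text_default <- theme_gray()$axis.text$colour
axis_text_default

# overriding only `text` leaves the resolved axis.text colour unchanged ...
th <- theme_gray() + theme(text = element_text(color = "royalblue2"))
axis_text_after <- calc_element("axis.text", th)$colour

# ... while axis.title has no colour of its own and inherits from `text`
axis_title_after <- calc_element("axis.title", th)$colour

axis_text_after
axis_title_after
```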

# Horizontal and vertical alignment
ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.title.x = element_text(
      # horizontal justification
      # (0 = left)
      hjust = 0
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.title.x = element_text(
      # horizontal justification
      # (0.5 = center)
      hjust = 0.5
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.title.x = element_text(
      # horizontal justification
      # (1 = right)
      hjust = 1
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(
      # vertical justification
      # (0 = bottom)
      vjust = 0
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(
      # vertical justification
      # (0.5 = center)
      vjust = 0.5
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    axis.text.y = element_text(
      # vertical justification
      # (1 = top)
      vjust = 1
    )
  )

# Remove elements entirely: element_blank()
ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    # all text gone
    text = element_blank()
  )

# Set background color: element_rect()
ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    plot.background = element_rect(
      fill = "aliceblue"
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    panel.background = element_rect(
      fill = "aliceblue"
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    legend.box.background = element_rect(
      fill = "aliceblue",
      color = "steelblue4" # line color
    )
  )

ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    legend.box.background = element_rect(
      fill = "aliceblue",
      color = "steelblue4" # line color
    ),
    legend.box.margin = margin(7, 7, 7, 7)
  )

# Move the legend: legend.position
ggplot(penguins) +
  aes(flipper_length_mm, body_mass_g) +
  geom_point(aes(color = species)) +
  theme_minimal_grid() +
  theme(
    legend.box.background = element_rect(
      fill = "aliceblue",
      color = "steelblue4" # line color
    ),
    legend.box.margin = margin(7, 7, 7, 7),
    # relative position inside plot panel
    legend.position = c(1, 0),
    # justification relative to position
    legend.justification = c(1, 0)
  )

4.7 Data wrangling

# Example application of grouping: Counting
penguins %>%
  group_by(species) %>%
  summarize(
    n = n()  # n() returns the number of observations per group
  )
species n
Adelie 152
Chinstrap 68
Gentoo 124
# group by multiple variables
penguins %>%
  group_by(species, island) %>%
  summarize(
    n = n()  # n() returns the number of observations per group
  )
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Dream 68
Gentoo Biscoe 124
# count(...) is a short-cut for group_by(...) %>% summarize(n = n())
penguins %>%
  count(species, island)
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Dream 68
Gentoo Biscoe 124
# Performing multiple summaries at once
penguins %>%
  group_by(species) %>%
  summarize(
    n = n(),                                      # number of penguins
    mean_mass = mean(body_mass_g, na.rm = TRUE),                # mean body mass
    max_flipper_length = max(flipper_length_mm, na.rm = TRUE),  # max flipper length
    percent_female = sum(sex == "female", na.rm = TRUE) / sum(!is.na(sex))  # proportion (0-1) of females among penguins with known sex
  )
species n mean_mass max_flipper_length percent_female
Adelie 152 3700.662 210 0.500000
Chinstrap 68 3733.088 212 0.500000
Gentoo 124 5076.016 231 0.487395
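The percent_female column is a proportion computed by hand. The sketch below, on a hypothetical toy vector, shows that taking the mean of the logical comparison gives the same number, since the mean of a logical vector is the share of TRUE values:

```r
# hypothetical toy vector with one missing value
sex <- c("female", "male", NA, "female")

# form used in the summary above: females divided by non-missing observations
prop_sum <- sum(sex == "female", na.rm = TRUE) / sum(!is.na(sex))

# shortcut: mean of the logical vector, ignoring NA
prop_mean <- mean(sex == "female", na.rm = TRUE)

prop_sum                        # 2 of 3 known values are "female"
all.equal(prop_sum, prop_mean)  # both forms agree
```

Multiplying the result by 100 would turn the proportion into an actual percentage.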
# Making a wide summary table
penguins_wide <- penguins %>%
  count(species, island) %>%
  pivot_wider(names_from = "island", values_from = "n")

# going back to long format
penguins_wide %>% 
  pivot_longer(cols = -species, names_to = "island", values_to = "n")
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Biscoe NA
Chinstrap Dream 68
Chinstrap Torgersen NA
Gentoo Biscoe 124
Gentoo Dream NA
Gentoo Torgersen NA
# Column specifications work just like in select():
# specify columns by subtraction
penguins_wide %>% 
  pivot_longer(cols = -species, names_to = "island", values_to = "n")
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Biscoe NA
Chinstrap Dream 68
Chinstrap Torgersen NA
Gentoo Biscoe 124
Gentoo Dream NA
Gentoo Torgersen NA
# specify columns by explicit listing
penguins_wide %>% 
  pivot_longer(cols = c(Biscoe, Dream, Torgersen), names_to = "island", values_to = "n")
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Biscoe NA
Chinstrap Dream 68
Chinstrap Torgersen NA
Gentoo Biscoe 124
Gentoo Dream NA
Gentoo Torgersen NA
# specify columns by range
penguins_wide %>% 
  pivot_longer(cols = Biscoe:Torgersen, names_to = "island", values_to = "n")
species island n
Adelie Biscoe 44
Adelie Dream 56
Adelie Torgersen 52
Chinstrap Biscoe NA
Chinstrap Dream 68
Chinstrap Torgersen NA
Gentoo Biscoe 124
Gentoo Dream NA
Gentoo Torgersen NA
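The NA entries above stand for species and island combinations that never occur. If that is not wanted, both pivot functions have an argument for it. This is a short sketch, assuming the palmerpenguins data as used throughout (penguins_wide0 and penguins_long are hypothetical names):

```r
library(dplyr)
library(tidyr)
library(palmerpenguins)

# values_fill = 0 records a combination that never occurs as 0 instead of NA
penguins_wide0 <- penguins %>%
  count(species, island) %>%
  pivot_wider(names_from = "island", values_from = "n", values_fill = 0)

# values_drop_na = TRUE drops the NA rows when going back to long format
penguins_long <- penguins %>%
  count(species, island) %>%
  pivot_wider(names_from = "island", values_from = "n") %>%
  pivot_longer(cols = -species, names_to = "island", values_to = "n",
               values_drop_na = TRUE)

penguins_wide0
penguins_long
```

Which option is appropriate depends on whether a missing combination should count as a true zero or simply be left out.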
# Combine datasets: joins
band_members
name band
Mick Stones
John Beatles
Paul Beatles
band_instruments
name plays
John guitar
Paul bass
Keith guitar
left_join(band_members, band_instruments) # add right table to left; In case of doubt, use left_join()
name band plays
Mick Stones NA
John Beatles guitar
Paul Beatles bass
right_join(band_members, band_instruments) # add left table to right
name band plays
John Beatles guitar
Paul Beatles bass
Keith NA guitar
inner_join(band_members, band_instruments) # keep intersection only
name band plays
John Beatles guitar
Paul Beatles bass
full_join(band_members, band_instruments) # merge all cases
name band plays
Mick Stones NA
John Beatles guitar
Paul Beatles bass
Keith NA guitar
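The joins above guess the key column from the shared column name. Below is a small sketch of two related calls, using the same band_members and band_instruments tables that ship with dplyr (unmatched is a hypothetical name):

```r
library(dplyr)

# naming the key explicitly avoids the "Joining, by = ..." message
joined <- left_join(band_members, band_instruments, by = "name")
joined

# anti_join() lists the rows of the left table without a match in the right table
unmatched <- anti_join(band_members, band_instruments, by = "name")
unmatched
```

An anti_join() like this is a quick way to see which rows would end up with NA after a left_join().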

4.8 Getting things into the right order

# We can use fct_relevel() to manually order the bars in a bar plot
ggplot(penguins, aes(y = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie"))) +
  geom_bar()

# Somewhat cleaner: mutate first, then plot
penguins %>%
  mutate(species = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie")) %>%
  ggplot(aes(y = species)) +
  geom_bar()

# We order things in ggplot with factors
penguins %>%
  mutate(species = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie")) %>% # ggplot generally places visual elements in the order defined by the levels 
  slice(1:30) %>%   # get first 30 rows
  pull(species)     # pull out just the `species` column
##  [1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## [11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## [21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## Levels: Chinstrap Gentoo Adelie
# The order of the y axis is from bottom to top
penguins %>%
  mutate(species = fct_relevel(species, "Chinstrap", "Gentoo", "Adelie")) %>%
  ggplot(aes(y = species)) +
  geom_bar()

# Reorder based on frequency: fct_infreq()
penguins %>%
  mutate(species = fct_infreq(species)) %>%
  slice(1:30) %>%   # get first 30 rows
  pull(species)     # pull out just the `species` column
##  [1] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## [11] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## [21] Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie Adelie
## Levels: Adelie Gentoo Chinstrap
penguins %>%
  mutate(species = fct_infreq(species)) %>%
  ggplot(aes(y = species)) +
  geom_bar()

# Reverse order: fct_rev()
penguins %>%
  mutate(species = fct_rev(fct_infreq(species))) %>%
  ggplot(aes(y = species)) + geom_bar()

# Reorder based on numeric values
penguins %>%
  count(species)
species n
Adelie 152
Chinstrap 68
Gentoo 124
penguins %>%
  count(species) %>%
  mutate(species = fct_reorder(species, n)) %>% # The order is ascending, from smallest to largest value 
  pull(species)
## [1] Adelie    Chinstrap Gentoo   
## Levels: Chinstrap Gentoo Adelie
penguins %>%
  count(species) %>%
  mutate(species = fct_reorder(species, n)) %>%
  ggplot(aes(species, n)) + geom_col()

penguins %>%
  count(species) %>% # summarize data
  mutate(species = fct_reorder(species, n)) %>%
  ggplot(aes(n, species)) + geom_col()

penguins %>% 
  # modify the original dataset, no summary
  mutate(species = fct_infreq(species)) %>%
  ggplot(aes(y = fct_rev(species))) + geom_bar()
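
The reordering functions used so far can be checked without plotting by inspecting `levels()`. The following is a small sketch on a hypothetical toy factor `x` (not data from this report), assuming the forcats package is installed.

```r
library(forcats)

# Hypothetical toy factor: "b" occurs 3 times, "c" twice, "a" once
x <- factor(c("b", "a", "c", "b", "c", "b"))

levels(x)                        # default, alphabetical: "a" "b" "c"
levels(fct_relevel(x, "c"))      # move "c" to the front: "c" "a" "b"
levels(fct_infreq(x))            # most frequent first:   "b" "c" "a"
levels(fct_rev(fct_infreq(x)))   # least frequent first:  "a" "c" "b"
```

Because ggplot2 draws a discrete y axis from bottom to top in level order, the last variant is the one that puts the most frequent level at the top of a horizontal bar chart.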

# Default order is alphabetic, from bottom to top
gapminder %>%
  filter(
    year == 2007,
    continent == "Americas"
  ) %>%
  ggplot(aes(lifeExp, country)) + 
  geom_point()

gapminder %>%
  filter(
    year == 2007,
    continent == "Americas"
  ) %>%
  mutate(
    country = fct_reorder(country, lifeExp) # Order is ascending from bottom to top 
  ) %>%
  ggplot(aes(lifeExp, country)) + 
  geom_point()

# We can also order facets
gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)

# When the levels of a factor occur more than once, fct_reorder() applies a summary function
gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  mutate(country = fct_reorder(country, lifeExp)) %>% # default: order by median
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)

# We can also set the summary function explicitly
gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  mutate(country = fct_reorder(country, lifeExp, min)) %>% # order by minimum
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)

gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  mutate(country = fct_reorder(country, lifeExp, max)) %>% # order by maximum
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)

gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  # order by custom function: here, difference between max and min
  mutate(country = fct_reorder(country, lifeExp, function(x) { max(x) - min(x) })) %>%
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)

gapminder %>%
  filter(country %in% c("Norway", "Portugal", "Spain", "Austria")) %>%
  # order by custom function: here, min minus max (the negated range), which reverses the previous ordering
  mutate(country = fct_reorder(country, lifeExp, function(x) { min(x) - max(x) })) %>%
  ggplot(aes(year, lifeExp)) + geom_line() +
  facet_wrap(vars(country), nrow = 1)
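
How the summary function changes the result of `fct_reorder()` is easier to see on a tiny example. This sketch uses hypothetical vectors `g` and `v` (not data from this report) and assumes the forcats package is installed.

```r
library(forcats)

# Hypothetical toy data: three groups with three values each
g <- factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c"))
v <- c(1, 2, 30,  5, 6, 7,  3, 4, 20)

# Default summary is the median: a = 2, c = 4, b = 6
levels(fct_reorder(g, v))

# Order by maximum: b = 7, c = 20, a = 30
levels(fct_reorder(g, v, .fun = max))

# Order by range (max - min): b = 2, c = 17, a = 29
levels(fct_reorder(g, v, .fun = function(x) max(x) - min(x)))

# Descending order of the median: b, c, a
levels(fct_reorder(g, v, .desc = TRUE))
```

Using `.desc = TRUE` is an alternative to negating the summary function when only the direction should change.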

flight_data <- flights %>% # take data on individual flights
  left_join(airlines) %>%  # add in full-length airline names
  select(name, carrier, flight, year, month, day, origin, dest) # pick columns of interest

# alphabetic ordering
flight_data %>%
  ggplot(aes(y = name)) + 
  geom_bar()

flight_data %>%
  mutate(
    name = fct_infreq(name)  # order by frequency (most common level first)
  ) %>%
  ggplot(aes(y = fct_rev(name))) + # reverse order 
  geom_bar()

flight_data %>%
  mutate(
    # keep only the 7 most common airlines (lumping)
    name = fct_infreq(fct_lump_n(name, 7))
  ) %>%
  ggplot(aes(y = fct_rev(name))) + 
  geom_bar()

# In most cases, you will want to order before lumping
flight_data %>%
  mutate(
    # order before lumping
    name = fct_lump_n(fct_infreq(name), 7)
  ) %>%
  ggplot(aes(y = fct_rev(name))) + 
  geom_bar()
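
The difference between the two nesting orders shows up in where the "Other" level ends up. Below is a minimal sketch on a hypothetical character vector `x` (not the flights data), assuming the forcats package is installed.

```r
library(forcats)

# Hypothetical toy data: a = 6, b = 3, c = 2, d = 1, e = 1
x <- rep(c("a", "b", "c", "d", "e"), times = c(6, 3, 2, 1, 1))

# Lump first, then order: "Other" (4 cases) is sorted by its own count
levels(fct_infreq(fct_lump_n(x, 2)))   # "a" "Other" "b"

# Order first, then lump: "Other" stays at the end of the levels
levels(fct_lump_n(fct_infreq(x), 2))   # "a" "b" "Other"
```

Ordering before lumping keeps the catch-all category at one end of the axis regardless of how many cases it collects, which is usually easier to read.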

# Visually separate the lumped "Other" category from the named airlines
flight_data %>%
  mutate(
    name = fct_lump_n(fct_infreq(name), 7),
    # Use `fct_other()` to manually lump all
    # levels not called "Other" into "Named"
    highlight = fct_other(
      name,
      keep = "Other", other_level = "Named"
    )
  ) %>%
  ggplot() +
  aes(
    y = fct_rev(name),
    fill = highlight
  ) + 
  geom_bar()

# Put the legend in the right order
flight_data %>%
  mutate(
    name = fct_lump_n(fct_infreq(name), 7),
    # Use `fct_other()` to manually lump all
    # levels not called "Other" into "Named"
    highlight = fct_other(
      name,
      keep = "Other", other_level = "Named"
    )
  ) %>%
  ggplot() +
  aes(
    y = fct_rev(name),
    # reverse fill aesthetic
    fill = fct_rev(highlight)
  ) + 
  geom_bar()

# final version
flight_data %>%
  mutate(
    name = fct_lump_n(fct_infreq(name), 7),
    highlight = fct_other(
      name, keep = "Other", other_level = "Named"
    )
  ) %>%
  ggplot() +
  aes(y = fct_rev(name), fill = highlight) + 
  geom_bar() +
  scale_x_continuous(
    name = "Number of flights",
    expand = expansion(mult = c(0, 0.07))
  ) +
  scale_y_discrete(name = NULL) +
  scale_fill_manual(
    values = c(
      Named = "gray50", Other = "#98545F"
    ),
    guide = "none"
  ) +
  theme_minimal_vgrid()

4.9 Visualizing proportions

# Making pie charts with ggplot: polar coords
# the data
bundestag <- tibble(
  party = c("CDU/CSU", "SPD", "FDP"),
  seats = c(243, 214, 39)
)
# make bar chart in polar coords
ggplot(bundestag) +
  aes(seats, "YY", fill = party) + 
  geom_col() +
  coord_polar() +
  scale_x_continuous(
    name = NULL, breaks = NULL
  ) +
  scale_y_discrete(
    name = NULL, breaks = NULL
  ) +
  ggtitle("German Bundestag 1976-1980")

# Making pie charts with ggplot: ggforce stat pie
ggplot(bundestag) +
  aes(
    x0 = 0, y0 = 0, # position of pie center
    r0 = 0, r = 1,  # inner and outer radius
    amount = seats, # size of pie slices
    fill = party
  ) + 
  geom_arc_bar(stat = "pie") +
  coord_fixed()

ggplot(bundestag) +
  aes(
    x0 = 1, y0 = 1, # position of pie center
    r0 = 1, r = 2,  # inner and outer radius
    amount = seats, # size of pie slices
    fill = party
  ) + 
  geom_arc_bar(stat = "pie") +
  coord_fixed(
    xlim = c(-1, 3), ylim = c(-1, 3)
  )

# Making pie charts with ggplot: ggforce with manually computed angles
# prepare pie data
pie_data <- bundestag %>%
  arrange(seats) # sort so pie slices end up sorted
pie_data
party seats
FDP 39
SPD 214
CDU/CSU 243
pie_data <- bundestag %>%
  arrange(seats) %>% # sort so pie slices end up sorted
  mutate(
    end_angle = 2*pi*cumsum(seats)/sum(seats),   # ending angle for each pie slice
    start_angle = lag(end_angle, default = 0),   # starting angle for each pie slice
    mid_angle = 0.5*(start_angle + end_angle),   # middle of each pie slice, for text labels
    # horizontal and vertical justifications for outer labels
    hjust = ifelse(mid_angle > pi, 1, 0),
    vjust = ifelse(mid_angle < pi/2 | mid_angle > 3*pi/2, 0, 1)
  )
pie_data
party seats end_angle start_angle mid_angle hjust vjust
FDP 39 0.4940408 0.0000000 0.2470204 0 0
SPD 214 3.2049312 0.4940408 1.8494860 0 1
CDU/CSU 243 6.2831853 3.2049312 4.7440583 1 0
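
The angle columns above follow from simple arithmetic: each slice ends at the cumulative seat share times 2*pi and starts where the previous slice ended. As a base R sketch of the same computation (the vector names mirror the columns above; `slice_share` is an extra check added here for illustration):

```r
# Seat counts in the order produced by arrange(seats): FDP, SPD, CDU/CSU
party <- c("FDP", "SPD", "CDU/CSU")
seats <- c(39, 214, 243)

end_angle   <- 2 * pi * cumsum(seats) / sum(seats)   # cumulative share of the full circle
start_angle <- c(0, head(end_angle, -1))             # base R stand-in for lag(end_angle, default = 0)
mid_angle   <- (start_angle + end_angle) / 2         # used to position the text labels

# Each slice's angular width is proportional to its seat share
slice_share <- (end_angle - start_angle) / (2 * pi)

data.frame(party, seats, start_angle, end_angle, mid_angle, slice_share)
```

For example, the FDP slice ends at 2 * pi * 39 / 496, which is about 0.494, and the last slice ends at exactly 2 * pi.
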
ggplot(pie_data) +
  aes(
    x0 = 0, y0 = 0, r0 = 0, r = 1,
    start = start_angle, end = end_angle,
    fill = party
  ) +
  geom_arc_bar() +
  geom_text( # place amounts inside the pie
    aes(
      x = 0.6 * sin(mid_angle),
      y = 0.6 * cos(mid_angle),
      label = seats
    )
  ) +
  coord_fixed()

ggplot(pie_data) +
  aes(
    x0 = 0, y0 = 0, r0 = 0, r = 1,
    start = start_angle, end = end_angle,
    fill = party
  ) +
  geom_arc_bar() +
  geom_text( # place amounts inside the pie
    aes(
      x = 0.6 * sin(mid_angle),
      y = 0.6 * cos(mid_angle),
      label = seats
    )
  ) +
  geom_text( # place party name outside the pie
    aes(
      x = 1.05 * sin(mid_angle),
      y = 1.05 * cos(mid_angle),
      label = party,
      hjust = hjust, vjust = vjust
    )
  ) +
  coord_fixed()

ggplot(pie_data) +
  aes(
    x0 = 0, y0 = 0, r0 = 0, r = 1,
    start = start_angle, end = end_angle,
    fill = party
  ) +
  geom_arc_bar() +
  geom_text( # place amounts inside the pie
    aes(
      x = 0.6 * sin(mid_angle),
      y = 0.6 * cos(mid_angle),
      label = seats
    )
  ) +
  geom_text( # place party name outside the pie
    aes(
      x = 1.05 * sin(mid_angle),
      y = 1.05 * cos(mid_angle),
      label = party,
      hjust = hjust, vjust = vjust
    )
  ) +
  coord_fixed(xlim = c(-1.8, 1.3))

ggplot(pie_data) +
  aes(
    x0 = 0, y0 = 0, r0 = 0.4, r = 1,
    start = start_angle, end = end_angle,
    fill = party
  ) +
  geom_arc_bar() +
  geom_text( # place amounts inside the pie
    aes(
      x = 0.7 * sin(mid_angle),
      y = 0.7 * cos(mid_angle),
      label = seats
    )
  ) +
  geom_text( # place party name outside the pie
    aes(
      x = 1.05 * sin(mid_angle),
      y = 1.05 * cos(mid_angle),
      label = party,
      hjust = hjust, vjust = vjust
    )
  ) +
  coord_fixed(xlim = c(-1.8, 1.3))

4.11 Working with models

penguins %>%
  ggplot(aes(body_mass_g, flipper_length_mm)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  facet_wrap(vars(species))

# We can fit a linear model with lm()
penguins_adelie <- filter(penguins, species == "Adelie")
lm_out <- lm(flipper_length_mm ~ body_mass_g, data = penguins_adelie)
summary(lm_out)
## 
## Call:
## lm(formula = flipper_length_mm ~ body_mass_g, data = penguins_adelie)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.2769  -3.6192   0.0569   3.4696  18.0477 
## 
## Coefficients:
##               Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 165.244813   3.849281  42.929 < 0.0000000000000002 ***
## body_mass_g   0.006677   0.001032   6.468        0.00000000134 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.798 on 149 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2192, Adjusted R-squared:  0.214 
## F-statistic: 41.83 on 1 and 149 DF,  p-value: 0.000000001343
# Use map() to fit models to groups of data
lm_data <- penguins %>%
  nest(data = -species) %>% # nest all data except species column
  mutate(
    # apply linear model to each nested data frame
    fit = map(data, ~lm(flipper_length_mm ~ body_mass_g, data = .x))
  )
lm_data$fit[[1]] 
## 
## Call:
## lm(formula = flipper_length_mm ~ body_mass_g, data = .x)
## 
## Coefficients:
## (Intercept)  body_mass_g  
##  165.244813     0.006677
summary(lm_data$fit[[1]]) # summarize the first model, which is for Adelie
## 
## Call:
## lm(formula = flipper_length_mm ~ body_mass_g, data = .x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.2769  -3.6192   0.0569   3.4696  18.0477 
## 
## Coefficients:
##               Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 165.244813   3.849281  42.929 < 0.0000000000000002 ***
## body_mass_g   0.006677   0.001032   6.468        0.00000000134 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.798 on 149 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.2192, Adjusted R-squared:  0.214 
## F-statistic: 41.83 on 1 and 149 DF,  p-value: 0.000000001343
summary(lm_data$fit[[2]]) # summarize the second model, which is for Gentoo
## 
## Call:
## lm(formula = flipper_length_mm ~ body_mass_g, data = .x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.0194  -2.7401   0.1781   2.9859   8.9806 
## 
## Coefficients:
##                Estimate  Std. Error t value            Pr(>|t|)    
## (Intercept) 171.3041886   4.2443258   40.36 <0.0000000000000002 ***
## body_mass_g   0.0090391   0.0008321   10.86 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.633 on 121 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.4937, Adjusted R-squared:  0.4896 
## F-statistic:   118 on 1 and 121 DF,  p-value: < 0.00000000000000022
summary(lm_data$fit[[3]]) # summarize the third model, which is for Chinstrap
## 
## Call:
## lm(formula = flipper_length_mm ~ body_mass_g, data = .x)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.4296  -3.3315   0.4097   2.8889  11.5941 
## 
## Coefficients:
##               Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 151.380874   6.574823  23.024 < 0.0000000000000002 ***
## body_mass_g   0.011905   0.001752   6.795        0.00000000375 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.512 on 66 degrees of freedom
## Multiple R-squared:  0.4116, Adjusted R-squared:  0.4027 
## F-statistic: 46.17 on 1 and 66 DF,  p-value: 0.000000003748
glance(lm_out) # from the broom package: provides model-wide summary estimates in tidy format
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.2192128 0.2139726 5.797764 41.83305 0 1 -478.6314 963.2627 972.3146 5008.496 149 151
tidy(lm_out) # provides information about regression coefficients in tidy format
term estimate std.error statistic p.value
(Intercept) 165.2448126 3.8492806 42.928752 0
body_mass_g 0.0066769 0.0010323 6.467848 0
# Apply these functions to multiple models with map()
lm_summary <- penguins %>%
  nest(data = -species) %>%
  mutate(
    fit = map(data, ~lm(flipper_length_mm ~ body_mass_g, data = .x)),
    glance_out = map(fit, glance)
  ) %>%
  select(species, glance_out) %>%
  unnest(cols = glance_out)
lm_summary
species r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
Adelie 0.2192128 0.2139726 5.797764 41.83305 0 1 -478.6314 963.2627 972.3146 5008.496 149 151
Gentoo 0.4937402 0.4895563 4.633213 118.00774 0 1 -362.1110 730.2221 738.6587 2597.467 121 123
Chinstrap 0.4115985 0.4026833 5.511975 46.16830 0 1 -211.5436 429.0872 435.7457 2005.203 66 68
# Make label data
label_data <- lm_summary %>%
  mutate(
    rsqr = signif(r.squared, 2),  # round to 2 significant digits
    pval = signif(p.value, 2),
    label = glue("R^2 = {rsqr}, P = {pval}"),
    body_mass_g = 6400, flipper_length_mm = 175 # label position in plot
  ) %>%
  select(species, label, body_mass_g, flipper_length_mm)
label_data
species label body_mass_g flipper_length_mm
Adelie R^2 = 0.22, P = 0.0000000013 6400 175
Gentoo R^2 = 0.49, P = 0.00000000000000000013 6400 175
Chinstrap R^2 = 0.41, P = 0.0000000037 6400 175
# Plotting
ggplot(penguins, aes(body_mass_g, flipper_length_mm)) + geom_point() +
  geom_text(
    data = label_data, aes(label = label),
    size = 10/.pt, hjust = 1  # 10pt, right-justified
  ) +
  geom_smooth(method = "lm", se = FALSE) + facet_wrap(vars(species))
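
The fit-per-group pattern does not depend on `nest()` and `map()`. As a rough base R sketch of the same idea, here swapped onto the built-in `iris` data so that it runs without extra packages (the formula and the names `fits`, `r2`, `pval` and `labels` are illustrative assumptions, not part of this report):

```r
# Split the data by group and fit one linear model per piece
fits <- lapply(
  split(iris, iris$Species),
  function(d) lm(Petal.Length ~ Sepal.Length, data = d)
)

# Extract R^2 and the slope's p-value from each model summary
r2   <- sapply(fits, function(m) summary(m)$r.squared)
pval <- sapply(fits, function(m) coef(summary(m))["Sepal.Length", "Pr(>|t|)"])

# Build one text label per group, rounded to 2 significant digits
labels <- sprintf("R^2 = %s, P = %s", signif(r2, 2), signif(pval, 2))
data.frame(species = names(fits), label = labels)
```

The tidyverse version keeps data, models and summaries together in one tibble, which tends to scale better once several summaries per model are needed.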

4.12 Visualizing uncertainty

# Making a plot with error bars in R
lm_data <- gapminder %>%
  nest(data = -c(continent, year))
lm_data
continent year data
Asia 1952 <tibble [33 × 4]>
Asia 1957 <tibble [33 × 4]>
Asia 1962 <tibble [33 × 4]>
Asia 1967 <tibble [33 × 4]>
Asia 1972 <tibble [33 × 4]>
Asia 1977 <tibble [33 × 4]>
Asia 1982 <tibble [33 × 4]>
Asia 1987 <tibble [33 × 4]>
Asia 1992 <tibble [33 × 4]>
Asia 1997 <tibble [33 × 4]>
Asia 2002 <tibble [33 × 4]>
Asia 2007 <tibble [33 × 4]>
Europe 1952 <tibble [30 × 4]>
Europe 1957 <tibble [30 × 4]>
Europe 1962 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 64.820, 69.540, 70.250, 61.930, 69.510, 67.130, 69.900, 72.350, 68.750, 70.510, 70.300, 69.510, 67.960, 73.680, 70.290, 69.240, 63.728, 73.230, 73.470, 67.640, 64.390, 66.800, 64.531, 70.330, 69.150, 69.690, 73.370, 71.320, 52.098, 70.760, 1728137.000, 7129864.000, 9218400.000, 3349000.000, 8012946.000, 4076557.000, 9620282.000, 4646899.000, 4491443.000, 47124000.000, 73739117.000, 8448233.000, 10063000.000, 182053.000, 2830000.000, 50843200.000, 474528.000, 11805689.000, 3638919.000, 30329617.000, 9019800.000, 18680721.000, 7616060.000, 4237384.000, 1582962.000, 31158061.000, 7561588.000, 5666000.000, 29788695.000, 53292000.000, 2312.889, 10750.721, 10991.207, 1709.684, 4254.338, 5477.890, 10136.867, 13583.314, 9371.843, 10560.486, 12902.463, 6017.191, 7550.360, 10350.159, 6631.597, 8243.582, 4649.594, 12790.850, 13450.402, 5338.752, 4727.955, 4734.998, 6289.629, 7481.108, 7402.303, 5693.844, 12329.442, 20431.093, 2322.870, 12477.177
Europe 1967 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 66.220, 70.140, 70.940, 64.790, 70.420, 68.500, 70.380, 72.960, 69.830, 71.550, 70.800, 71.000, 69.500, 73.730, 71.080, 71.060, 67.178, 73.820, 74.080, 69.610, 66.600, 66.800, 66.914, 70.980, 69.180, 71.440, 74.160, 72.770, 54.336, 71.360, 1984060.000, 7376998.000, 9556500.000, 3585000.000, 8310226.000, 4174366.000, 9835109.000, 4838800.000, 4605744.000, 49569000.000, 76368453.000, 8716441.000, 10223422.000, 198676.000, 2900100.000, 52667100.000, 501035.000, 12596822.000, 3786019.000, 31785378.000, 9103000.000, 19284814.000, 7971222.000, 4442238.000, 1646912.000, 32850275.000, 7867931.000, 6063000.000, 33411317.000, 54959000.000, 2760.197, 12834.602, 13149.041, 2172.352, 5577.003, 6960.298, 11399.445, 15937.211, 10921.636, 12999.918, 14745.626, 8513.097, 9326.645, 13319.896, 7655.569, 10022.401, 5907.851, 15363.251, 16361.876, 6557.153, 6361.518, 6470.867, 7991.707, 8412.902, 9405.489, 7993.512, 15258.297, 22966.144, 2826.356, 14142.851
Europe 1972 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 67.690, 70.630, 71.440, 67.450, 70.900, 69.610, 70.290, 73.470, 70.870, 72.380, 71.000, 72.340, 69.760, 74.460, 71.280, 72.190, 70.636, 73.750, 74.340, 70.850, 69.260, 69.210, 68.700, 70.350, 69.820, 73.060, 74.720, 73.780, 57.005, 72.010, 2263554.000, 7544201.000, 9709100.000, 3819000.000, 8576200.000, 4225310.000, 9862158.000, 4991596.000, 4639657.000, 51732000.000, 78717088.000, 8888628.000, 10394091.000, 209275.000, 3024400.000, 54365564.000, 527678.000, 13329874.000, 3933004.000, 33039545.000, 8970450.000, 20662648.000, 8313288.000, 4593433.000, 1694510.000, 34513161.000, 8122293.000, 6401400.000, 37492953.000, 56079000.000, 3313.422, 16661.626, 16672.144, 2860.170, 6597.494, 9164.090, 13108.454, 18866.207, 14358.876, 16107.192, 18016.180, 12724.830, 10168.656, 15798.064, 9530.773, 12269.274, 7778.414, 18794.746, 18965.056, 8006.507, 9022.247, 8011.414, 10522.067, 9674.168, 12383.486, 10638.751, 17832.025, 27195.113, 3450.696, 15895.116
Europe 1977 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 68.930, 72.170, 72.800, 69.860, 70.810, 70.640, 70.710, 74.690, 72.520, 73.830, 72.500, 73.680, 69.950, 76.110, 72.030, 73.480, 73.066, 75.240, 75.370, 70.670, 70.410, 69.460, 70.300, 70.450, 70.970, 74.390, 75.440, 75.390, 59.507, 72.760, 2509048.000, 7568430.000, 9821800.000, 4086000.000, 8797022.000, 4318673.000, 10161915.000, 5088419.000, 4738902.000, 53165019.000, 78160773.000, 9308479.000, 10637171.000, 221823.000, 3271900.000, 56059245.000, 560073.000, 13852989.000, 4043205.000, 34621254.000, 9662600.000, 21658597.000, 8686367.000, 4827803.000, 1746919.000, 36439000.000, 8251648.000, 6316424.000, 42404033.000, 56179000.000, 3533.004, 19749.422, 19117.974, 3528.481, 7612.240, 11305.385, 14800.161, 20422.901, 15605.423, 18292.635, 20512.921, 14195.524, 11674.837, 19654.962, 11150.981, 14255.985, 9595.930, 21209.059, 23311.349, 9508.141, 10172.486, 9356.397, 12980.670, 10922.664, 15277.030, 13236.921, 18855.725, 26982.291, 4269.122, 17428.748
Europe 1982 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 70.420, 73.180, 73.930, 70.690, 71.080, 70.460, 70.960, 74.630, 74.550, 74.890, 73.800, 75.240, 69.390, 76.990, 73.100, 74.980, 74.101, 76.050, 75.970, 71.320, 72.770, 69.660, 70.162, 70.800, 71.063, 76.300, 76.420, 76.210, 61.036, 74.040, 2780097.000, 7574613.000, 9856303.000, 4172693.000, 8892098.000, 4413368.000, 10303704.000, 5117810.000, 4826933.000, 54433565.000, 78335266.000, 9786480.000, 10705535.000, 233997.000, 3480000.000, 56535636.000, 562548.000, 14310401.000, 4114787.000, 36227381.000, 9859650.000, 22356726.000, 9032824.000, 5048043.000, 1861252.000, 37983310.000, 8325260.000, 6468126.000, 47328791.000, 56339704.000, 3630.881, 21597.084, 20979.846, 4126.613, 8224.192, 13221.822, 15377.229, 21688.040, 18533.158, 20293.897, 22031.533, 15268.421, 12545.991, 23269.607, 12618.321, 16537.483, 11222.588, 21399.460, 26298.635, 8451.531, 11753.843, 9605.314, 15181.093, 11348.546, 17866.722, 13926.170, 20667.381, 28397.715, 4241.356, 18232.425
Europe 1987 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 72.000, 74.940, 75.350, 71.140, 71.340, 71.520, 71.580, 74.800, 74.830, 76.340, 74.847, 76.670, 69.580, 77.230, 74.360, 76.420, 74.865, 76.830, 75.890, 70.980, 74.060, 69.530, 71.218, 71.080, 72.250, 76.900, 77.190, 77.410, 63.108, 75.007, 3075321.000, 7578903.000, 9870200.000, 4338977.000, 8971958.000, 4484310.000, 10311597.000, 5127024.000, 4931729.000, 55630100.000, 77718298.000, 9974490.000, 10612740.000, 244676.000, 3539900.000, 56729703.000, 569473.000, 14665278.000, 4186147.000, 37740710.000, 9915289.000, 22686371.000, 9230783.000, 5199318.000, 1945870.000, 38880702.000, 8421403.000, 6649942.000, 52881328.000, 56981620.000, 3738.933, 23687.826, 22525.563, 4314.115, 8239.855, 13822.584, 16310.443, 25116.176, 21141.012, 22066.442, 24639.186, 16120.528, 12986.480, 26923.206, 13872.867, 19207.235, 11732.510, 23651.324, 31540.975, 9082.351, 13039.309, 9696.273, 15870.879, 12037.268, 18678.535, 15764.983, 23586.929, 30281.705, 5089.044, 21664.788
Europe 1992 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 71.581, 76.040, 76.460, 72.178, 71.190, 72.527, 72.400, 75.330, 75.700, 77.460, 76.070, 77.030, 69.170, 78.770, 75.467, 77.440, 75.435, 77.420, 77.320, 70.990, 74.860, 69.360, 71.659, 71.380, 73.640, 77.570, 78.160, 78.030, 66.146, 76.420, 3326498.000, 7914969.000, 10045622.000, 4256013.000, 8658506.000, 4494013.000, 10315702.000, 5171393.000, 5041039.000, 57374179.000, 80597764.000, 10325429.000, 10348684.000, 259012.000, 3557761.000, 56840847.000, 621621.000, 15174244.000, 4286357.000, 38370697.000, 9927680.000, 22797027.000, 9826397.000, 5302888.000, 1999210.000, 39549438.000, 8718867.000, 6995447.000, 58179144.000, 57866349.000, 2497.438, 27042.019, 25575.571, 2546.781, 6302.623, 8447.795, 14297.021, 26406.740, 20647.165, 24703.796, 26505.303, 17541.496, 10535.629, 25144.392, 17558.816, 22013.645, 7003.339, 26790.950, 33965.661, 7738.881, 16207.267, 6598.410, 9325.068, 9498.468, 14214.717, 18603.065, 23880.017, 31871.530, 5678.348, 22705.093
Europe 1997 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 72.950, 77.510, 77.530, 73.244, 70.320, 73.680, 74.010, 76.110, 77.130, 78.640, 77.340, 77.869, 71.040, 78.950, 76.122, 78.820, 75.445, 78.030, 78.320, 72.750, 75.970, 69.720, 72.232, 72.710, 75.130, 78.770, 79.390, 79.370, 68.835, 77.218, 3428038.000, 8069876.000, 10199787.000, 3607000.000, 8066057.000, 4444595.000, 10300707.000, 5283663.000, 5134406.000, 58623428.000, 82011073.000, 10502372.000, 10244684.000, 271192.000, 3667233.000, 57479469.000, 692651.000, 15604464.000, 4405672.000, 38654957.000, 10156415.000, 22562458.000, 10336594.000, 5383010.000, 2011612.000, 39855442.000, 8897619.000, 7193761.000, 63047647.000, 58808266.000, 3193.055, 29095.921, 27561.197, 4766.356, 5970.389, 9875.605, 16048.514, 29804.346, 23723.950, 25889.785, 27788.884, 18747.698, 11712.777, 28061.100, 24521.947, 24675.024, 6465.613, 30246.131, 41283.164, 10159.584, 17641.032, 7346.548, 7914.320, 12126.231, 17161.107, 20445.299, 25266.595, 32135.323, 6601.430, 26074.531
Europe 2002 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 75.651, 78.980, 78.320, 74.090, 72.140, 74.876, 75.510, 77.180, 78.370, 79.590, 78.670, 78.256, 72.590, 80.500, 77.783, 80.240, 73.981, 78.530, 79.050, 74.670, 77.290, 71.322, 73.213, 73.800, 76.660, 79.780, 80.040, 80.620, 70.845, 78.471, 3508512.000, 8148312.000, 10311970.000, 4165416.000, 7661799.000, 4481020.000, 10256295.000, 5374693.000, 5193039.000, 59925035.000, 82350671.000, 10603863.000, 10083313.000, 288030.000, 3879155.000, 57926999.000, 720230.000, 16122830.000, 4535591.000, 38625976.000, 10433867.000, 22404337.000, 10111559.000, 5410052.000, 2011497.000, 40152517.000, 8954175.000, 7361757.000, 67308928.000, 59912431.000, 4604.212, 32417.608, 30485.884, 6018.975, 7696.778, 11628.389, 17596.210, 32166.500, 28204.591, 28926.032, 30035.802, 22514.255, 14843.936, 31163.202, 34077.049, 27968.098, 6557.194, 33724.758, 44683.975, 12002.239, 19970.908, 7885.360, 7236.075, 13638.778, 20660.019, 24835.472, 29341.631, 34480.958, 6508.086, 29478.999
Europe 2007 2.000, 7.000, 10.000, 13.000, 16.000, 32.000, 34.000, 35.000, 44.000, 45.000, 48.000, 50.000, 57.000, 58.000, 63.000, 65.000, 85.000, 91.000, 96.000, 103.000, 104.000, 107.000, 112.000, 115.000, 116.000, 119.000, 123.000, 124.000, 132.000, 134.000, 76.423, 79.829, 79.441, 74.852, 73.005, 75.748, 76.486, 78.332, 79.313, 80.657, 79.406, 79.483, 73.338, 81.757, 78.885, 80.546, 74.543, 79.762, 80.196, 75.563, 78.098, 72.476, 74.002, 74.663, 77.926, 80.941, 80.884, 81.701, 71.777, 79.425, 3600523.000, 8199783.000, 10392226.000, 4552198.000, 7322858.000, 4493312.000, 10228744.000, 5468120.000, 5238460.000, 61083916.000, 82400996.000, 10706290.000, 9956108.000, 301931.000, 4109086.000, 58147733.000, 684736.000, 16570613.000, 4627926.000, 38518241.000, 10642836.000, 22276056.000, 10150265.000, 5447502.000, 2009245.000, 40448191.000, 9031088.000, 7554661.000, 71158647.000, 60776238.000, 5937.030, 36126.493, 33692.605, 7446.299, 10680.793, 14619.223, 22833.309, 35278.419, 33207.084, 30470.017, 32170.374, 27538.412, 18008.944, 36180.789, 40675.996, 28569.720, 9253.896, 36797.933, 49357.190, 15389.925, 20509.648, 10808.476, 9786.535, 18678.314, 25768.258, 28821.064, 33859.748, 37506.419, 8458.276, 33203.261
Africa 1952 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 43.0770, 30.0150, 38.2230, 47.6220, 31.9750, 39.0310, 38.5230, 35.4630, 38.0920, 40.7150, 39.1430, 42.1110, 40.4770, 34.8120, 41.8930, 34.4820, 35.9280, 34.0780, 37.0030, 30.0000, 43.1490, 33.6090, 32.5000, 42.2700, 42.1380, 38.4800, 42.7230, 36.6810, 36.2560, 33.6850, 40.5430, 50.9860, 42.8730, 31.2860, 41.7250, 37.4440, 36.3240, 52.7240, 40.0000, 46.4710, 37.2780, 30.3310, 32.9780, 45.0090, 38.6350, 41.4070, 41.2150, 38.5960, 44.6000, 39.9780, 42.0380, 48.4510, 9279525.0000, 4232095.0000, 1738315.0000, 442308.0000, 4469979.0000, 2445618.0000, 5009067.0000, 1291695.0000, 2682462.0000, 153936.0000, 14100005.0000, 854885.0000, 2977019.0000, 63149.0000, 22223309.0000, 216964.0000, 1438760.0000, 20860941.0000, 420702.0000, 284320.0000, 5581001.0000, 2664249.0000, 580653.0000, 6464046.0000, 748747.0000, 863308.0000, 1019729.0000, 4762912.0000, 2917802.0000, 3838168.0000, 1022556.0000, 516556.0000, 9939217.0000, 6446316.0000, 485831.0000, 3379468.0000, 33119096.0000, 257700.0000, 2534927.0000, 60011.0000, 2755589.0000, 2143249.0000, 2526994.0000, 14264935.0000, 8504667.0000, 290243.0000, 8322925.0000, 1219113.0000, 3647735.0000, 5824797.0000, 2672000.0000, 3080907.0000, 2449.0082, 3520.6103, 1062.7522, 851.2411, 543.2552, 339.2965, 1172.6677, 1071.3107, 1178.6659, 1102.9909, 780.5423, 2125.6214, 1388.5947, 2669.5295, 1418.8224, 375.6431, 328.9406, 362.1463, 4293.4765, 485.2307, 911.2989, 510.1965, 299.8503, 853.5409, 298.8462, 575.5730, 2387.5481, 1443.0117, 369.1651, 452.3370, 
743.1159, 1967.9557, 1688.2036, 468.5260, 2423.7804, 761.8794, 1077.2819, 2718.8853, 493.3239, 879.5836, 1450.3570, 879.7877, 1135.7498, 4725.2955, 1615.9911, 1148.3766, 716.6501, 859.8087, 1468.4756, 734.7535, 1147.3888, 406.8841
Africa 1957 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 45.6850, 31.9990, 40.3580, 49.6180, 34.9060, 40.5330, 40.4280, 37.4640, 39.8810, 42.4600, 40.6520, 45.0530, 42.4690, 37.3280, 44.4440, 35.9830, 38.0470, 36.6670, 38.9990, 32.0650, 44.7790, 34.5580, 33.4890, 44.6860, 45.0470, 39.4860, 45.2890, 38.8650, 37.2070, 35.3070, 42.3380, 58.0890, 45.4230, 33.7790, 45.2260, 38.5980, 37.8020, 55.0900, 41.5000, 48.9450, 39.3290, 31.5700, 34.9770, 47.9850, 39.6240, 43.4240, 42.9740, 41.2080, 47.1000, 42.5710, 44.0770, 50.4690, 10270856.0000, 4561361.0000, 1925173.0000, 474639.0000, 4713416.0000, 2667518.0000, 5359923.0000, 1392284.0000, 2894855.0000, 170928.0000, 15577932.0000, 940458.0000, 3300000.0000, 71851.0000, 25009741.0000, 232922.0000, 1542611.0000, 22815614.0000, 434904.0000, 323150.0000, 6391288.0000, 2876726.0000, 601095.0000, 7454779.0000, 813338.0000, 975950.0000, 1201578.0000, 5181679.0000, 3221238.0000, 4241884.0000, 1076852.0000, 609816.0000, 11406350.0000, 7038035.0000, 548080.0000, 3692184.0000, 37173340.0000, 308700.0000, 2822082.0000, 61325.0000, 3054547.0000, 2295678.0000, 2780415.0000, 16151549.0000, 9753392.0000, 326741.0000, 9452826.0000, 1357445.0000, 3950849.0000, 6675501.0000, 3016000.0000, 3646340.0000, 3013.9760, 3827.9405, 959.6011, 918.2325, 617.1835, 379.5646, 1313.0481, 1190.8443, 1308.4956, 1211.1485, 905.8602, 2315.0566, 1500.8959, 2864.9691, 1458.9153, 426.0964, 344.1619, 378.9042, 4976.1981, 520.9267, 1043.5615, 576.2670, 431.7905, 944.4383, 335.9971, 620.9700, 3448.2844, 1589.2027, 416.3698, 490.3822, 
846.1203, 2034.0380, 1642.0023, 495.5868, 2621.4481, 835.5234, 1100.5926, 2769.4518, 540.2894, 860.7369, 1567.6530, 1004.4844, 1258.1474, 5487.1042, 1770.3371, 1244.7084, 698.5356, 925.9083, 1395.2325, 774.3711, 1311.9568, 518.7643
Africa 1962 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 48.3030, 34.0000, 42.6180, 51.5200, 37.8140, 42.0450, 42.6430, 39.4750, 41.7160, 44.4670, 42.1220, 48.4350, 44.9300, 39.6930, 46.9920, 37.4850, 40.1580, 40.0590, 40.4890, 33.8960, 46.4520, 35.7530, 34.4880, 47.9490, 47.7470, 40.5020, 47.8080, 40.8480, 38.4100, 36.9360, 44.2480, 60.2460, 47.9240, 36.1610, 48.3860, 39.4870, 39.3600, 57.6660, 43.0000, 51.8930, 41.4540, 32.7670, 36.9810, 49.9510, 40.8700, 44.9920, 44.2460, 43.9220, 49.5790, 45.3440, 46.0230, 52.3580, 11000948.0000, 4826015.0000, 2151895.0000, 512764.0000, 4919632.0000, 2961915.0000, 5793633.0000, 1523478.0000, 3150417.0000, 191689.0000, 17486434.0000, 1047924.0000, 3832408.0000, 89898.0000, 28173309.0000, 249220.0000, 1666618.0000, 25145372.0000, 455661.0000, 374020.0000, 7355248.0000, 3140003.0000, 627820.0000, 8678557.0000, 893143.0000, 1112796.0000, 1441863.0000, 5703324.0000, 3628608.0000, 4690372.0000, 1146757.0000, 701016.0000, 13056604.0000, 7788944.0000, 621392.0000, 4076008.0000, 41871351.0000, 358900.0000, 3051242.0000, 65345.0000, 3430243.0000, 2467895.0000, 3080153.0000, 18356657.0000, 11183227.0000, 370006.0000, 10863958.0000, 1528098.0000, 4286552.0000, 7688797.0000, 3421000.0000, 4277736.0000, 2550.8169, 4269.2767, 949.4991, 983.6540, 722.5120, 355.2032, 1399.6074, 1193.0688, 1389.8176, 1406.6483, 896.3146, 2464.7832, 1728.8694, 3020.9893, 1693.3359, 582.8420, 380.9958, 419.4564, 6631.4592, 599.6503, 1190.0411, 686.3737, 522.0344, 896.9664, 411.8006, 634.1952, 6757.0308, 1643.3871, 427.9011, 496.1743, 
1055.8960, 2529.0675, 1566.3535, 556.6864, 3173.2156, 997.7661, 1150.9275, 3173.7233, 597.4731, 1071.5511, 1654.9887, 1116.6399, 1369.4883, 5768.7297, 1959.5938, 1856.1821, 722.0038, 1067.5348, 1660.3032, 767.2717, 1452.7258, 527.2722
Africa 1967 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 51.4070, 35.9850, 44.8850, 53.2980, 40.6970, 43.5480, 44.7990, 41.4780, 43.6010, 46.4720, 44.0560, 52.0400, 47.3500, 42.0740, 49.2930, 38.9870, 42.1890, 42.1150, 44.5980, 35.8570, 48.0720, 37.1970, 35.4920, 50.6540, 48.4920, 41.5360, 50.2270, 42.8810, 39.4870, 38.4870, 46.2890, 61.5570, 50.3350, 38.1130, 51.1590, 40.1180, 41.0400, 60.5420, 44.1000, 54.4250, 43.5630, 34.1130, 38.9770, 51.9270, 42.8580, 46.6330, 45.7570, 46.7690, 52.0530, 48.0510, 47.7680, 53.9950, 12760499.0000, 5247469.0000, 2427334.0000, 553541.0000, 5127935.0000, 3330989.0000, 6335506.0000, 1733638.0000, 3495967.0000, 217378.0000, 19941073.0000, 1179760.0000, 4744870.0000, 127617.0000, 31681188.0000, 259864.0000, 1820319.0000, 27860297.0000, 489004.0000, 439593.0000, 8490213.0000, 3451418.0000, 601287.0000, 10191512.0000, 996380.0000, 1279406.0000, 1759224.0000, 6334556.0000, 4147252.0000, 5212416.0000, 1230542.0000, 789309.0000, 14770296.0000, 8680909.0000, 706640.0000, 4534062.0000, 47287752.0000, 414024.0000, 3451079.0000, 70787.0000, 3965841.0000, 2662190.0000, 3428839.0000, 20997321.0000, 12716129.0000, 420690.0000, 12607312.0000, 1735550.0000, 4786986.0000, 8900294.0000, 3900000.0000, 4995432.0000, 3246.9918, 5522.7764, 1035.8314, 1214.7093, 794.8266, 412.9775, 1508.4531, 1136.0566, 1196.8106, 1876.0296, 861.5932, 2677.9396, 2052.0505, 3020.0505, 1814.8807, 915.5960, 468.7950, 516.1186, 8358.7620, 734.7829, 1125.6972, 708.7595, 715.5806, 1056.7365, 498.6390, 713.6036, 18772.7517, 1634.0473, 495.5148, 
545.0099, 1421.1452, 2475.3876, 1711.0448, 566.6692, 3793.6948, 1054.3849, 1014.5141, 4021.1757, 510.9637, 1384.8406, 1612.4046, 1206.0435, 1284.7332, 7114.4780, 1687.9976, 2613.1017, 848.2187, 1477.5968, 1932.3602, 908.9185, 1777.0773, 569.7951
Africa 1972 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 54.5180, 37.9280, 47.0140, 56.0240, 43.5910, 44.0570, 47.0490, 43.4570, 45.5690, 48.9440, 45.9890, 54.9070, 49.8010, 44.3660, 51.1370, 40.5160, 44.1420, 43.5150, 48.6900, 38.3080, 49.8750, 38.8420, 36.4860, 53.5590, 49.7670, 42.6140, 52.7730, 44.8510, 41.7660, 39.9770, 48.4370, 62.9440, 52.8620, 40.3280, 53.8670, 40.5460, 42.8210, 64.2740, 44.6000, 56.4800, 45.8150, 35.4000, 40.9730, 53.6960, 45.0830, 49.5520, 47.6200, 49.7590, 55.6020, 51.0160, 50.1070, 55.6350, 14760787.0000, 5894858.0000, 2761407.0000, 619351.0000, 5433886.0000, 3529983.0000, 7021028.0000, 1927260.0000, 3899068.0000, 250027.0000, 23007669.0000, 1340458.0000, 6071696.0000, 178848.0000, 34807417.0000, 277603.0000, 2260187.0000, 30770372.0000, 537977.0000, 517101.0000, 9354120.0000, 3811387.0000, 625361.0000, 12044785.0000, 1116779.0000, 1482628.0000, 2183877.0000, 7082430.0000, 4730997.0000, 5828158.0000, 1332786.0000, 851334.0000, 16660670.0000, 9809596.0000, 821782.0000, 5060262.0000, 53740085.0000, 461633.0000, 3992121.0000, 76595.0000, 4588696.0000, 2879013.0000, 3840161.0000, 23935810.0000, 14597019.0000, 480105.0000, 14706593.0000, 2056351.0000, 5303507.0000, 10190285.0000, 4506497.0000, 5861135.0000, 4182.6638, 5473.2880, 1085.7969, 2263.6111, 854.7360, 464.0995, 1684.1465, 1070.0133, 1104.1040, 1937.5777, 904.8961, 3213.1527, 2378.2011, 3694.2124, 2024.0081, 672.4123, 514.3242, 566.2439, 11401.9484, 756.0868, 1178.2237, 741.6662, 820.2246, 1222.3600, 496.5816, 803.0055, 21011.4972, 1748.5630, 584.6220, 
581.3689, 1586.8518, 2575.4842, 1930.1950, 724.9178, 3746.0809, 954.2092, 1698.3888, 5047.6586, 590.5807, 1532.9853, 1597.7121, 1353.7598, 1254.5761, 7765.9626, 1659.6528, 3364.8366, 915.9851, 1649.6602, 2753.2860, 950.7359, 1773.4983, 799.3622
Africa 1977 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 58.0140, 39.4830, 49.1900, 59.3190, 46.1370, 45.9100, 49.3550, 46.7750, 47.3830, 50.9390, 47.8040, 55.6250, 52.3740, 46.5190, 53.3190, 42.0240, 44.5350, 44.5100, 52.7900, 41.8420, 51.7560, 40.7620, 37.4650, 56.1550, 52.2080, 43.7640, 57.4420, 46.8810, 43.7670, 41.7140, 50.8520, 64.9300, 55.7300, 42.4950, 56.4370, 41.2910, 44.5140, 67.0640, 45.0000, 58.5500, 48.8790, 36.7880, 41.9740, 55.5270, 47.8000, 52.5370, 49.9190, 52.8870, 59.8370, 50.3500, 51.3860, 57.6740, 17152804.0000, 6162675.0000, 3168267.0000, 781472.0000, 5889574.0000, 3834415.0000, 7959865.0000, 2167533.0000, 4388260.0000, 304739.0000, 26480870.0000, 1536769.0000, 7459574.0000, 228694.0000, 38783863.0000, 192675.0000, 2512642.0000, 34617799.0000, 706367.0000, 608274.0000, 10538093.0000, 4227026.0000, 745228.0000, 14500404.0000, 1251524.0000, 1703617.0000, 2721783.0000, 8007166.0000, 5637246.0000, 6491649.0000, 1456688.0000, 913025.0000, 18396941.0000, 11127868.0000, 977026.0000, 5682086.0000, 62209173.0000, 492095.0000, 4657072.0000, 86796.0000, 5260855.0000, 3140897.0000, 4353666.0000, 27129932.0000, 17104986.0000, 551425.0000, 17129565.0000, 2308582.0000, 6005061.0000, 11457758.0000, 5216550.0000, 6642107.0000, 4910.4168, 3008.6474, 1029.1613, 3214.8578, 743.3870, 556.1033, 1783.4329, 1109.3743, 1133.9850, 1172.6030, 795.7573, 3259.1790, 2517.7365, 3081.7610, 2785.4936, 958.5668, 505.7538, 556.8084, 21745.5733, 884.7553, 993.2240, 874.6859, 764.7260, 1267.6132, 745.3695, 640.3224, 21951.2118, 1544.2286, 663.2237, 
686.3953, 1497.4922, 3710.9830, 2370.6200, 502.3197, 3876.4860, 808.8971, 1981.9518, 4319.8041, 670.0806, 1737.5617, 1561.7691, 1348.2852, 1450.9925, 8028.6514, 2202.9884, 3781.4106, 962.4923, 1532.7770, 3120.8768, 843.7331, 1588.6883, 685.5877
Africa 1982 3.0000, 4.0000, 11.0000, 14.0000, 17.0000, 18.0000, 20.0000, 22.0000, 23.0000, 27.0000, 28.0000, 29.0000, 31.0000, 36.0000, 39.0000, 41.0000, 42.0000, 43.0000, 46.0000, 47.0000, 49.0000, 52.0000, 53.0000, 69.0000, 74.0000, 75.0000, 76.0000, 77.0000, 78.0000, 80.0000, 81.0000, 82.0000, 86.0000, 87.0000, 89.0000, 94.0000, 95.0000, 106.0000, 108.0000, 109.0000, 111.0000, 113.0000, 117.0000, 118.0000, 121.0000, 122.0000, 127.0000, 129.0000, 131.0000, 133.0000, 141.0000, 142.0000, 61.3680, 39.9420, 50.9040, 61.4840, 48.1220, 47.4710, 52.9610, 48.2950, 49.5170, 52.9330, 47.7840, 56.6950, 53.9830, 48.8120, 56.0060, 43.6620, 43.8900, 44.9160, 56.5640, 45.5800, 53.7440, 42.8910, 39.3270, 58.7660, 55.0780, 44.8520, 62.1550, 48.9690, 45.6420, 43.9160, 53.5990, 66.7110, 59.6500, 42.7950, 58.9680, 42.5980, 45.8260, 69.8850, 46.2180, 60.3510, 52.3790, 38.4450, 42.9550, 58.1610, 50.3380, 55.5610, 50.6080, 55.4710, 64.0480, 49.8490, 51.8210, 60.3630, 20033753.0000, 7016384.0000, 3641603.0000, 970347.0000, 6634596.0000, 4580410.0000, 9250831.0000, 2476971.0000, 4875118.0000, 348643.0000, 30646495.0000, 1774735.0000, 9025951.0000, 305991.0000, 45681811.0000, 285483.0000, 2637297.0000, 38111756.0000, 753874.0000, 715523.0000, 11400338.0000, 4710497.0000, 825987.0000, 17661452.0000, 1411807.0000, 1956875.0000, 3344074.0000, 9171477.0000, 6502825.0000, 6998256.0000, 1622136.0000, 992040.0000, 20198730.0000, 12587223.0000, 1099010.0000, 6437188.0000, 73039376.0000, 517810.0000, 5507565.0000, 98593.0000, 6147783.0000, 3464522.0000, 5828892.0000, 31140029.0000, 20367053.0000, 649901.0000, 19844382.0000, 2644765.0000, 6734098.0000, 12939400.0000, 6100407.0000, 7636524.0000, 5745.1602, 2756.9537, 1277.8976, 4551.1421, 807.1986, 559.6032, 2367.9833, 956.7530, 797.9081, 1267.1001, 673.7478, 4879.5075, 2602.7102, 2879.4681, 3503.7296, 927.8253, 524.8758, 577.8607, 15113.3619, 835.8096, 876.0326, 857.2504, 838.1240, 1348.2258, 797.2631, 572.1996, 17364.2754, 1302.8787, 632.8039, 
(Remaining rows of the printed nested table omitted: one row per continent and year, with the country, lifeExp, pop and gdpPercap values of that group stored in the data list-column.)
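# Fit one linear model per continent and year, then keep only the slope
# estimates (tidy() comes from the broom package)
lm_data <- gapminder %>%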
  nest(data = -c(continent, year)) %>%
  mutate(
    fit = map(data, ~lm(lifeExp ~ log(gdpPercap), data = .x)),
    tidy_out = map(fit, tidy)
  ) %>%
  unnest(cols = tidy_out) %>%
  select(-fit, -data) %>%
  filter(term != "(Intercept)", continent != "Oceania")
lm_data
continent year term estimate std.error statistic p.value
Asia 1952 log(gdpPercap) 4.159673 1.2507050 3.325863 0.0022766
Asia 1957 log(gdpPercap) 4.171219 1.2795470 3.259918 0.0027070
Asia 1962 log(gdpPercap) 4.592523 1.2352003 3.718039 0.0007945
Asia 1967 log(gdpPercap) 4.501820 1.1532145 3.903714 0.0004771
Asia 1972 log(gdpPercap) 4.444768 1.0080337 4.409344 0.0001158
Asia 1977 log(gdpPercap) 4.872457 1.0262141 4.747992 0.0000442
Asia 1982 log(gdpPercap) 4.779256 0.8523760 5.606982 0.0000038
Asia 1987 log(gdpPercap) 5.174537 0.7267098 7.120500 0.0000001
Asia 1992 log(gdpPercap) 5.088508 0.6492224 7.837850 0.0000000
Asia 1997 log(gdpPercap) 5.112844 0.6275664 8.147097 0.0000000
Asia 2002 log(gdpPercap) 5.443311 0.6963806 7.816574 0.0000000
Asia 2007 log(gdpPercap) 5.157259 0.6938654 7.432651 0.0000000
Europe 1952 log(gdpPercap) 9.004054 0.9871368 9.121384 0.0000000
Europe 1957 log(gdpPercap) 7.448632 0.9450840 7.881450 0.0000000
Europe 1962 log(gdpPercap) 5.912350 0.8530460 6.930868 0.0000002
Europe 1967 log(gdpPercap) 5.430009 0.7964583 6.817694 0.0000002
Europe 1972 log(gdpPercap) 4.506842 0.7571422 5.952438 0.0000021
Europe 1977 log(gdpPercap) 4.487710 0.7559818 5.936266 0.0000022
Europe 1982 log(gdpPercap) 4.370204 0.7910998 5.524214 0.0000066
Europe 1987 log(gdpPercap) 4.144326 0.7523846 5.508254 0.0000069
Europe 1992 log(gdpPercap) 3.483074 0.5450157 6.390777 0.0000006
Europe 1997 log(gdpPercap) 3.759339 0.5068161 7.417560 0.0000000
Europe 2002 log(gdpPercap) 3.739622 0.4452979 8.398022 0.0000000
Europe 2007 log(gdpPercap) 4.226891 0.5246111 8.057190 0.0000000
Africa 1952 log(gdpPercap) 2.337502 0.9714374 2.406230 0.0198559
Africa 1957 log(gdpPercap) 2.687954 1.0555398 2.546521 0.0140063
Africa 1962 log(gdpPercap) 2.756273 1.0584146 2.604153 0.0120951
Africa 1967 log(gdpPercap) 3.068467 0.9882165 3.105056 0.0031303
Africa 1972 log(gdpPercap) 3.803497 0.9624671 3.951820 0.0002439
Africa 1977 log(gdpPercap) 4.509730 0.9200603 4.901559 0.0000104
Africa 1982 log(gdpPercap) 5.614144 0.8975984 6.254628 0.0000001
Africa 1987 log(gdpPercap) 6.478642 0.8977865 7.216239 0.0000000
Africa 1992 log(gdpPercap) 7.332773 1.1048259 6.637039 0.0000000
Africa 1997 log(gdpPercap) 6.885190 1.0361925 6.644702 0.0000000
Africa 2002 log(gdpPercap) 5.161485 1.2133718 4.253836 0.0000920
Africa 2007 log(gdpPercap) 4.260876 1.1906864 3.578504 0.0007793
Americas 1952 log(gdpPercap) 10.440891 2.7158881 3.844375 0.0008273
Americas 1957 log(gdpPercap) 10.324214 2.3966385 4.307789 0.0002614
Americas 1962 log(gdpPercap) 10.392712 2.2729647 4.572316 0.0001352
Americas 1967 log(gdpPercap) 8.723394 1.9798350 4.406122 0.0002046
Americas 1972 log(gdpPercap) 8.472344 1.7820437 4.754285 0.0000859
Americas 1977 log(gdpPercap) 8.214429 1.8265673 4.497195 0.0001630
Americas 1982 log(gdpPercap) 8.429255 1.5629293 5.393241 0.0000177
Americas 1987 log(gdpPercap) 7.098118 1.1351988 6.252753 0.0000022
Americas 1992 log(gdpPercap) 6.055116 0.8945236 6.769096 0.0000007
Americas 1997 log(gdpPercap) 5.394425 0.8590215 6.279732 0.0000021
Americas 2002 log(gdpPercap) 5.054808 0.8442795 5.987127 0.0000042
Americas 2007 log(gdpPercap) 4.494066 0.7515981 5.979347 0.0000043
ggplot2::ggplot(lm_data) +
  aes(
    x = year, y = estimate,
    ymin = estimate - 1.96*std.error,
    ymax = estimate + 1.96*std.error,
    color = continent
  ) +
  geom_pointrange(
    position = position_dodge(width = 1)
  ) +
  scale_x_continuous(
    breaks = unique(gapminder$year)
  ) +
  theme(legend.position = "top")
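
The factor 1.96 in the ymin and ymax aesthetics above is the 97.5% quantile of the standard normal distribution, so each range is an approximate 95% confidence interval for the slope. A minimal sketch of that calculation, using the Asia 1952 row from the table above (the variable names are illustrative and not part of the original script):

```r
# Normal-approximation 95% confidence interval for one slope estimate.
# Values are copied from the Asia 1952 row of lm_data; names are illustrative.
estimate  <- 4.159673
std_error <- 1.2507050

z <- qnorm(0.975)  # roughly 1.96
ci <- c(
  lower = estimate - z * std_error,
  upper = estimate + z * std_error
)
ci
```

For groups with few countries, a t quantile from qt() with the model's residual degrees of freedom would give a slightly wider interval than this normal approximation.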

# The ggdist package provides many different visualizations of uncertainty
# Half-eyes
lm_data %>%
  filter(year == 1952) %>%
  mutate(
    continent =
      fct_reorder(continent, estimate)
  ) %>%
  ggplot2::ggplot(aes(x = estimate, y = continent)) +
  ggdist::stat_dist_halfeye(
    aes(dist = dist_normal(
      mu = estimate, sigma = std.error
    )),
    point_size = 4
  )

# Gradient interval
lm_data %>%
  filter(year == 1952) %>%
  mutate(
    continent =
      fct_reorder(continent, estimate)
  ) %>%
  ggplot2::ggplot(aes(x = estimate, y = continent)) +
  ggdist::stat_dist_gradientinterval(
    aes(dist = dist_normal(
      mu = estimate, sigma = std.error
    )),
    point_size = 4,
    fill = "skyblue"
  )

# Dots interval
lm_data %>%
  filter(year == 1952) %>%
  mutate(
    continent =
      fct_reorder(continent, estimate)
  ) %>%
  ggplot2::ggplot(aes(x = estimate, y = continent)) +
  ggdist::stat_dist_dotsinterval(
    aes(dist = dist_normal(
      mu = estimate, sigma = std.error
    )),
    point_size = 4,
    fill = "skyblue",
    quantiles = 20
  )

# Dots interval with fewer quantiles (10 instead of 20)
lm_data %>%
  filter(year == 1952) %>%
  mutate(
    continent =
      fct_reorder(continent, estimate)
  ) %>%
  ggplot2::ggplot(aes(x = estimate, y = continent)) +
  ggdist::stat_dist_dotsinterval(
    aes(dist = dist_normal(
      mu = estimate, sigma = std.error
    )),
    point_size = 4,
    fill = "skyblue",
    quantiles = 10
  )

4.13 Dimension reduction

blue_jays <- read_csv("input/blue_jays.csv")

blue_jays %>% 
  ggplot() +
  aes(skull_size_mm, head_length_mm) + 
  geom_point(aes(color = sex))

# Plot with scaling
blue_jays %>% 
  # scale all numeric columns
  mutate(across(where(is.numeric), scale)) %>%
  ggplot() +
  aes(skull_size_mm, head_length_mm) + 
  geom_point(aes(color = sex))

# We perform a PCA with prcomp()
pca_fit <- blue_jays %>% 
  select(where(is.numeric)) %>% # retain only numeric columns
  scale() %>%                   # scale to zero mean and unit variance
  prcomp()                      # do PCA

# Then we add the PC coordinates to the original dataset
# and plot PC 2 against PC 1
pca_fit %>%
  # add PCs to the original dataset
  augment(blue_jays) %>%
  ggplot(aes(.fittedPC1, .fittedPC2)) +
  geom_point(aes(color = sex))

# Plot PC 3 against PC 2
pca_fit %>%
  # add PCs to the original dataset
  augment(blue_jays) %>%
  ggplot(aes(.fittedPC2, .fittedPC3)) +
  geom_point(aes(color = sex))

# Plot the rotation matrix
arrow_style <- arrow(
  angle = 20, length = grid::unit(8, "pt"),
  ends = "first", type = "closed"
)
pca_fit %>%
  # extract rotation matrix
  tidy(matrix = "rotation") %>%
  pivot_wider(
    names_from = "PC", values_from = "value",
    names_prefix = "PC"
  ) %>%
  ggplot(aes(PC1, PC2)) +
  geom_segment(
    xend = 0, yend = 0,
    arrow = arrow_style
  ) +
  geom_text(aes(label = column), hjust = 1) +
  xlim(-1.5, 0.5) + ylim(-1, 1) + 
  coord_fixed()

# Plot the variance explained
pca_fit %>%
  # extract eigenvalues
  tidy(matrix = "eigenvalues") %>%
  ggplot(aes(PC, percent)) + 
  geom_col() + 
  scale_x_continuous(
    # create one axis tick per PC
    breaks = 1:6
  ) +
  scale_y_continuous(
    name = "variance explained",
    # format y axis ticks as percent values
    labels = scales::label_percent(accuracy = 1)
  )
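
The percent column returned by tidy(matrix = "eigenvalues") is each component's share of the total variance, stored as a fraction. The sketch below recomputes that share by hand from the sdev element of a prcomp fit. It assumes the built-in iris data as a stand-in for blue_jays so that it runs without the input folder; pca_demo and the other names are illustrative.

```r
# Variance explained per principal component, computed by hand.
# iris stands in for blue_jays here so the sketch runs without input files.
pca_demo <- prcomp(scale(iris[, 1:4]))

eigenvalues   <- pca_demo$sdev^2                 # variance along each PC
var_explained <- eigenvalues / sum(eigenvalues)  # fraction of total variance
round(var_explained, 3)
```

Because the shares are fractions that sum to 1, a percent label formatter on the y axis is enough to display them as percentages.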

4.14 Clustering

ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) +
  geom_point()

# We perform k-means clustering with kmeans()
km_fit <- iris %>% 
  select(where(is.numeric)) %>%
  kmeans(
    centers = 3,  # number of cluster centers
    nstart = 10   # number of independent restarts of the algorithm
  )
km_fit
## K-means clustering with 3 clusters of sizes 62, 38, 50
## 
## Cluster means:
##   Sepal.Length Sepal.Width Petal.Length Petal.Width
## 1     5.901613    2.748387     4.393548    1.433871
## 2     6.850000    3.073684     5.742105    2.071053
## 3     5.006000    3.428000     1.462000    0.246000
## 
## Clustering vector:
##   [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
##  [38] 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [75] 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 2 2 1 2 2 2 2
## [112] 2 2 1 1 2 2 2 2 1 2 1 2 1 2 2 1 1 2 2 2 2 2 1 2 2 2 2 1 2 2 2 1 2 2 2 1 2
## [149] 2 1
## 
## Within cluster sum of squares by cluster:
## [1] 39.82097 23.87947 15.15100
##  (between_SS / total_SS =  88.4 %)
## 
## Available components:
## 
## [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
## [6] "betweenss"    "size"         "iter"         "ifault"
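
The cluster numbers in the output above are arbitrary labels, and kmeans() starts from random centers, so the numbering can differ between runs. One way to read the result is to cross-tabulate the cluster assignment against the known species. This is a base R sketch under assumptions: the seed value and the names km_demo and cluster_vs_species are illustrative and not part of the original script.

```r
# Cross-tabulate k-means clusters against the known iris species.
set.seed(123)  # arbitrary seed so the cluster numbering is reproducible
km_demo <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

cluster_vs_species <- table(
  cluster = km_demo$cluster,
  species = iris$Species
)
cluster_vs_species
```

Rows that are concentrated in a single species column indicate clusters that line up well with that species.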

4.15 Visualizing Spatial Data

# data
texas_income <- readRDS("input/Texas_income.rds")

ggplot(texas_income) + 
  geom_sf()

# plot only Travis County
texas_income %>% 
  filter(county == "Travis") %>%
  ggplot() + 
  geom_sf()

# plot the ten richest counties
texas_income %>% 
  slice_max(median_income, n = 10) %>%
  ggplot() + 
  geom_sf()

# color counties by median income
texas_income %>%
  ggplot(aes(fill = median_income)) + 
  geom_sf()

# highlight the ten richest counties
texas_income %>% 
  mutate(
    top_ten = rank(desc(median_income)) <= 10
  ) %>%
  ggplot(aes(fill = top_ten)) + 
  geom_sf(color = "black", size = 0.1) +
  scale_fill_manual(
    name = NULL,
    values = c(
      `TRUE` = "#D55E00",
      `FALSE` = "#E8EEF9"
    ),
    breaks = c(TRUE),
    labels = "top-10 median income"
  ) +
  theme_minimal_grid(11)

ggplot(texas_income) + 
  geom_sf(
    aes(fill = median_income),
    color = "black", size = 0.1
  ) +
  colorspace::scale_fill_continuous_sequential(
    palette = "Blues", rev = TRUE
  ) +
  theme_minimal_grid(11)

# We can customize the projection with coord_sf()
ggplot(texas_income) + 
  geom_sf(
    aes(fill = median_income),
    color = "black", size = 0.1
  ) +
  colorspace::scale_fill_continuous_sequential(
    palette = "Blues", rev = TRUE
  ) +
  coord_sf(
    # Texas Centric Albers Equal Area
    crs = 3083
  ) +
  theme_minimal_grid(11)

ggplot(texas_income) + 
  geom_sf(
    aes(fill = median_income),
    color = "black", size = 0.1
  ) +
  colorspace::scale_fill_continuous_sequential(
    palette = "Blues", rev = TRUE
  ) +
  coord_sf(
    # NAD83 / Texas Central (State Plane zone, Lambert Conformal Conic)
    crs = 32139
  ) + 
  theme_minimal_grid(11)

ggplot(texas_income) + 
  geom_sf(
    aes(fill = median_income),
    color = "black", size = 0.1
  ) +
  colorspace::scale_fill_continuous_sequential(
    palette = "Blues", rev = TRUE
  ) +
  coord_sf(
    # Web Mercator (Google Maps)
    crs = 3857
  ) + 
  theme_minimal_grid(11)
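
`coord_sf(crs = ...)` only reprojects at draw time. If the projected coordinates are needed in the data itself, `sf::st_transform()` can be used instead. A minimal sketch, assuming the North Carolina shapefile bundled with sf as a stand-in for `texas_income` (the object names are illustrative):

```r
library(sf)
library(ggplot2)

# Assumption: the shapefile shipped with sf stands in for texas_income
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# CRS the data are stored in
st_crs(nc)

# Option 1: reproject at draw time only
map_draw <- ggplot(nc) +
  geom_sf() +
  coord_sf(crs = 3857)

# Option 2: reproject the data, then plot without specifying a CRS
nc_merc <- st_transform(nc, crs = 3857)
map_data <- ggplot(nc_merc) +
  geom_sf()
```

Both maps should look the same; the second form is the one to use when the geometry is also needed for later calculations.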

4.16 Color spaces and color-vision deficiency (no R code)

4.17 Redundant coding, text annotations

# all data
tech_stocks <- read_csv("input/tech_stocks.csv") %>%
  mutate(date = ymd(date))

# Most recent values only
tech_stocks_last <- tech_stocks %>%
  filter(date == max(date))
tech_stocks_last
## company   ticker date        price index_price price_indexed
## Alphabet  GOOG   2017-06-02 975.60      285.20      342.0757
## Apple     AAPL   2017-06-02 155.45       80.14      193.9730
## Facebook  FB     2017-06-02 153.61       27.72      554.1486
## Microsoft MSFT   2017-06-02  71.76       28.45      252.2320

# Secondary axis trick
ggplot(tech_stocks) +
  aes(x = date, y = price_indexed) +
  geom_line(aes(color = company), na.rm = TRUE) +
  scale_x_date(
    limits = c(
      ymd("2012-06-01"),
      ymd("2017-05-31")
    ),
    expand = c(0, 0)
  ) + 
  scale_y_continuous(
    limits = c(0, 560),
    expand = c(0, 0),
    sec.axis = dup_axis(
      breaks = tech_stocks_last$price_indexed,
      labels = tech_stocks_last$company,
      name = NULL
    )
  ) +
  guides(color = "none")
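
The trick works because the secondary axis gets one break per series, placed at that series' last value and labeled with its name. A stripped-down sketch with hypothetical toy data (`toy` and `toy_last` are illustrative names, not part of the original data):

```r
library(ggplot2)

# Hypothetical toy series standing in for tech_stocks
toy <- data.frame(
  date = rep(as.Date("2020-01-01") + 0:2, times = 2),
  company = rep(c("A", "B"), each = 3),
  value = c(100, 120, 150, 100, 90, 110)
)

# The last observation of each series gives the label positions
toy_last <- toy[toy$date == max(toy$date), ]

p <- ggplot(toy, aes(date, value, color = company)) +
  geom_line() +
  scale_x_date(expand = c(0, 0)) +  # lines run right up to the secondary axis
  scale_y_continuous(
    sec.axis = dup_axis(
      breaks = toy_last$value,    # one tick per series, at its final value
      labels = toy_last$company,  # the tick label is the series name
      name = NULL
    )
  ) +
  guides(color = "none")          # the legend is redundant once lines are labeled
p
```

Removing the horizontal expansion matters here: with the default padding, the labels would sit some distance away from the line ends.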

# Manual labeling with geom_text()
# Manually create table with label positions
iris_labels <- tibble(
  Species = c("setosa", "virginica", "versicolor"),
  Sepal.Width = c(4.2, 3.76, 2.08),
  Sepal.Length = c(5.7, 7, 5.1),
  label = c("Iris setosa", "Iris virginica", "Iris versicolor"),
  hjust = c(0, 0.5, 0),
  vjust = c(0, 0.5, 1)
)
iris_labels
## Species    Sepal.Width Sepal.Length label           hjust vjust
## setosa            4.20          5.7 Iris setosa       0.0   0.0
## virginica         3.76          7.0 Iris virginica    0.5   0.5
## versicolor        2.08          5.1 Iris versicolor   0.0   1.0

# And plotting
ggplot(iris) +
  aes(Sepal.Length, Sepal.Width, color = Species) +
  geom_point(aes(shape = Species)) +
  geom_text(
    data = iris_labels,
    aes(
      label = label,
      hjust = hjust, vjust = vjust
    ),
    size = 14/.pt # 14pt font
  ) +
  stat_ellipse(size = 0.5) + # add ellipses 
  guides(color = "none", shape = "none")
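
The expression `14/.pt` converts a font size in points to the millimetres that `geom_text()` expects. A short sketch of the arithmetic (the variable name `size_mm` is illustrative):

```r
library(ggplot2)

# .pt is the number of points per millimetre (72.27 / 25.4)
.pt

# geom_text() takes size in mm, so a 14 pt label needs size = 14 / .pt
size_mm <- 14 / .pt
size_mm
```

Theme elements such as `element_text(size = 14)` are already specified in points, so the conversion is only needed inside geoms.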

# Automatic labeling with geom_text_repel()
mtcars_named <- mtcars %>%
  rownames_to_column("car") %>% # rownames to column car 
  select(car, weight = wt, mpg)
mtcars_named
## car                 weight  mpg
## Mazda RX4            2.620 21.0
## Mazda RX4 Wag        2.875 21.0
## Datsun 710           2.320 22.8
## Hornet 4 Drive       3.215 21.4
## Hornet Sportabout    3.440 18.7
## Valiant              3.460 18.1
## Duster 360           3.570 14.3
## Merc 240D            3.190 24.4
## Merc 230             3.150 22.8
## Merc 280             3.440 19.2
## Merc 280C            3.440 17.8
## Merc 450SE           4.070 16.4
## Merc 450SL           3.730 17.3
## Merc 450SLC          3.780 15.2
## Cadillac Fleetwood   5.250 10.4
## Lincoln Continental  5.424 10.4
## Chrysler Imperial    5.345 14.7
## Fiat 128             2.200 32.4
## Honda Civic          1.615 30.4
## Toyota Corolla       1.835 33.9
## Toyota Corona        2.465 21.5
## Dodge Challenger     3.520 15.5
## AMC Javelin          3.435 15.2
## Camaro Z28           3.840 13.3
## Pontiac Firebird     3.845 19.2
## Fiat X1-9            1.935 27.3
## Porsche 914-2        2.140 26.0
## Lotus Europa         1.513 30.4
## Ford Pantera L       3.170 15.8
## Ferrari Dino         2.770 19.7
## Maserati Bora        3.570 15.0
## Volvo 142E           2.780 21.4

ggplot(mtcars_named, aes(weight, mpg)) +
  geom_point() +
  geom_text_repel(
    aes(label = car),
    max.overlaps = Inf
  )

set.seed(42)
mtcars_named %>%
  mutate(
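    # blank out each label with probability 0.5 (about half of them);
    # ggrepel hides empty labels but still keeps the others away from those points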
    car = ifelse(runif(n()) < 0.5, "", car)
  ) %>% 
  ggplot(aes(weight, mpg)) +
  geom_point() +
  geom_text_repel(
    aes(label = car),
    max.overlaps = Inf,
    box.padding = 0.7 # padding around each label's bounding box (in lines; default 0.25)
  )

4.18 Interactive plots

# hovering displays species names
# iris_scatter <- ggplot(iris) + 
#   aes(
#     Sepal.Length, Sepal.Width,
#     color = Species
#   ) +
#   geom_point_interactive(
#     aes(tooltip = Species)
#   )
# 
# girafe(
#   ggobj = iris_scatter,
#   width_svg = 6,
#   height_svg = 6*0.618
# )
# 
# # Styling happens via Cascading Style Sheets (CSS)
# girafe(
#   ggobj = iris_scatter,
#   width_svg = 6,
#   height_svg = 6*0.618,
#   options = list(
#     opts_tooltip(
#       css = "background: #F5F5F5; color: #191970;"
#     )
#   )
# )
# 
# # Select multiple points at once with data_id aesthetic
# iris_scatter <- ggplot(iris) + 
#   aes(
#     Sepal.Length, Sepal.Width,
#     color = Species
#   ) +
#   geom_point_interactive(
#     aes(data_id = Species),
#     size = 2
#   )
# 
# girafe(
#   ggobj = iris_scatter,
#   width_svg = 6,
#   height_svg = 6*0.618
# )
# 
# # Via CSS
# girafe(
#   ggobj = iris_scatter,
#   width_svg = 6,
#   height_svg = 6*0.618,
#   options = list(
#     opts_hover(css = "fill: #202020;"),
#     opts_hover_inv(css = "opacity: 0.2;")
#   )
# )
# 
# # Interactive map example
# # load data
# US_states <- readRDS(url("https://wilkelab.org/SDS375/datasets/US_states.rds"))
# US_states
# 
# # plotting
# US_map <- US_states %>%
#   ggplot() +
#   geom_sf_interactive(
#     aes(data_id = name, tooltip = name)
#   ) +
#   theme_void()
# 
# girafe(
#   ggobj = US_map,
#   width_svg = 6,
#   height_svg = 6*0.618
# )
# 
# # Click to open a state's Wikipedia page
# US_map <- US_states %>%
#   mutate( # JavaScript call to open website
#     onclick = glue::glue(
#       'window.open("https://en.wikipedia.org/wiki/{name}")'
#     )
#   ) %>%
#   ggplot() +
#   geom_sf_interactive(
#     aes(
#       data_id = name, tooltip = name,
#       onclick = onclick
#     )
#   ) +
#   theme_void()
# 
# girafe(
#   ggobj = US_map,
#   width_svg = 6,
#   height_svg = 6*0.618
# )

4.19 Handling overlapping points

# Contour lines
blue_jays %>%
  ggplot(aes(body_mass_g, head_length_mm)) +
  geom_density_2d() +
  geom_point() +
  theme_bw(14)

blue_jays %>%
  ggplot(aes(body_mass_g, head_length_mm)) +
  geom_density_2d(bins = 5) +
  geom_point() +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_density_2d_filled(bins = 5, alpha = 0.5) +
  geom_density_2d(bins = 5, color = "black", size = 0.2) +
  geom_point() +
  theme_bw(14)

# 2D histograms
ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_bin2d() +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_bin2d(binwidth = c(3, 3)) +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_bin2d(binwidth = c(1, 5)) +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_bin2d(binwidth = c(5, 1)) +
  theme_bw(14)

# Hex bins
ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_hex() +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_hex(bins = 15) +
  theme_bw(14)

ggplot(blue_jays, aes(body_mass_g, head_length_mm)) +
  geom_hex(bins = 10) +
  theme_bw(14)
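
To see what a 2D histogram actually computes, the binned data can be pulled out of the plot with `layer_data()`. A minimal sketch, assuming the built-in `faithful` data as a stand-in for `blue_jays`:

```r
library(ggplot2)

# Assumption: the built-in faithful data stand in for blue_jays
p <- ggplot(faithful, aes(eruptions, waiting)) +
  geom_bin2d(binwidth = c(0.5, 5))  # bins are 0.5 wide in x and 5 wide in y

# layer_data() returns the computed bins, including the count in each one
bins <- layer_data(p)
head(bins)

# every observation falls into exactly one bin
sum(bins$count)
```

The two values in `binwidth` are in the units of the x and y variables respectively, which is why `c(1, 5)` and `c(5, 1)` above give very different pictures.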

4.20 Compound figures

# The patchwork package
# make first plot
p1 <- ggplot(mtcars) + 
  geom_point(aes(mpg, disp))
# make second plot
p2 <- ggplot(mtcars) + 
  aes(gear, disp, group = gear) +
  geom_boxplot()
# place plots side-by-side
p1 | p2

# make first plot
p1 <- ggplot(mtcars) + 
  geom_point(aes(mpg, disp))
# make second plot
p2 <- ggplot(mtcars) + 
  aes(gear, disp, group = gear) +
  geom_boxplot()
# place plots on top of one another
p1 / p2

# add a few more plots
p3 <- ggplot(mtcars) + 
  geom_smooth(aes(disp, qsec))
p4 <- ggplot(mtcars) + 
  geom_bar(aes(carb))
# make complex arrangement
(p1 | p2 | p3) / p4

# Plot annotations and themes
(p1 | p2 | p3) / p4 +
  plot_annotation(
    tag_levels = "a"
  )

(p1 | p2 | p3) / p4 +
  plot_annotation(
    tag_levels = "a"
  ) &
  theme_minimal_grid()

(p1 | p2 | p3) / p4 +
  plot_annotation(
    tag_levels = "a",
    title = "A plot about mtcars",
    subtitle = "With subtitle...",
    caption = "...and caption"
  ) &
  theme_minimal_grid()
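
The choice between `+` and `&` decides which subplots a theme is applied to. A small sketch of the three patchwork operators for adding elements, using hypothetical plot names `a` and `b`:

```r
library(ggplot2)
library(patchwork)

a <- ggplot(mtcars, aes(mpg, disp)) + geom_point()
b <- ggplot(mtcars, aes(factor(gear), disp)) + geom_boxplot()

# `+` adds the theme to the last plot only (here: b)
last_only <- (a | b) + theme_bw()

# `&` adds the theme to every plot in the composition
all_plots <- (a | b) & theme_bw()

# `*` adds it to every plot on the current nesting level
same_level <- (a | b) * theme_bw()
```

The parentheses around `a | b` are needed because R gives `|` lower precedence than `&`, `+`, and `/`.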

4.21 Functions and functional programming

# Avoid hard-coding specific values
penguins %>%
  filter(species == "Gentoo") %>%
  ggplot() +
  aes(bill_length_mm, body_mass_g) +
  geom_point() +
  ggtitle("Species: Gentoo") +
  xlab("bill length (mm)") +
  ylab("body mass (g)") +
  theme_minimal_grid() +
  theme(plot.title.position = "plot")

# species <- "Adelie"
# species <- "Chinstrap"
species <- "Gentoo" # value to filter on

penguins %>%
  filter(.data$species == .env$species) %>% # .data$: column of the data frame
  ggplot() +                                # .env$: variable in the calling environment
  aes(bill_length_mm, body_mass_g) +
  geom_point() +
  ggtitle(glue("Species: {species}")) +
  xlab("bill length (mm)") +
  ylab("body mass (g)") +
  theme_minimal_grid() +
  theme(plot.title.position = "plot")
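
Why the pronouns are needed becomes clear when the column and the variable share a name. A sketch with a hypothetical toy table (`birds` is not part of the original data):

```r
library(dplyr)

# Hypothetical toy table with a column called `species`
birds <- tibble(
  species = c("Adelie", "Gentoo", "Gentoo"),
  mass = c(3700, 5000, 5200)
)
species <- "Gentoo"  # variable with the same name as the column

# Both sides resolve to the column, so the test is TRUE for every row
nrow(filter(birds, species == species))

# .data$ picks the column, .env$ picks the variable
nrow(filter(birds, .data$species == .env$species))
```

Without the pronouns the filter silently keeps all rows, which is easy to miss in a plot.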

# Define a function
make_plot <- function(species) {
  penguins %>%
    filter(.data$species == .env$species) %>%
    ggplot() +
    aes(bill_length_mm, body_mass_g) +
    geom_point() +
    ggtitle(glue("Species: {species}")) +
    xlab("bill length (mm)") +
    ylab("body mass (g)") +
    theme_minimal_grid() +
    theme(plot.title.position = "plot")
}
make_plot("Adelie")

make_plot("Chinstrap")

make_plot("Gentoo")

# Automate calling the function
species <- c("Adelie", "Chinstrap", "Gentoo")
# map() takes each element of the vector species and uses it as input for make_plot()
plots <- map(species, make_plot)

# It returns a list of created plots:  
plots[[1]] 

plots[[2]] 

plots[[3]] 

# `walk()` is like `map()` but returns its input invisibly instead of a list of results
# we use it only for side effects (such as printing)
walk(plots, print)
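
The difference in return values between `map()` and `walk()` can be checked on a plain vector. A minimal sketch (variable names are illustrative):

```r
library(purrr)

x <- c(1, 2, 3)

# map() collects the return values in a list
squares <- map(x, function(v) v^2)
squares

# walk() calls the function for its side effect and hands back x unchanged (invisibly)
out <- walk(x, function(v) cat("value:", v, "\n"))
identical(out, x)
```

Because `walk()` passes its input through, it can also sit in the middle of a pipeline.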

# Write a more general function
make_plot <- function(species) {
  penguins %>% # hard-coded dataset!
    filter(.data$species == .env$species) %>%
    ggplot() +
    aes(bill_length_mm, body_mass_g) +
    geom_point() +
    ggtitle(glue("Species: {species}")) +
    xlab("bill length (mm)") +
    ylab("body mass (g)") +
    theme_minimal_grid() +
    theme(plot.title.position = "plot")
}

make_plot2 <- function(data, species) {
  data %>%
    # filter no longer needed
    ggplot() +
    aes(bill_length_mm, body_mass_g) +
    geom_point() +
    ggtitle(glue("Species: {species}")) +
    xlab("bill length (mm)") +
    ylab("body mass (g)") +
    theme_minimal_grid() +
    theme(plot.title.position = "plot")
}
data_adelie <- penguins %>%
  filter(species == "Adelie")
make_plot2(data_adelie, species = "Adelie")

# Use these concepts in a tidy pipeline
penguins %>%
  nest(data = -species) %>%
  mutate(plots = map2(data, species, make_plot2)) %>% # map2() is like map() but for functions with 2 arguments 
  pull(plots) %>%
  walk(print)
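
What `nest()` produces, and how `map2()` walks over two columns in parallel, may be easier to see without plots. A sketch assuming the built-in `mtcars` data as a stand-in, with `cyl` playing the role of `species`:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Built-in mtcars stands in for penguins; `cyl` plays the role of `species`
nested <- as_tibble(mtcars) %>%
  nest(data = -cyl) %>%                  # one row per cyl value, other columns packed into `data`
  mutate(n_rows = map_int(data, nrow))   # apply a function to each nested data frame
nested

# map2() iterates over two vectors in parallel, like map2(data, species, make_plot2)
labels <- map2_chr(nested$data, nested$cyl, function(d, k) {
  paste0(k, " cylinders: ", nrow(d), " cars")
})
labels
```

Each element of `data` is a complete data frame, which is why it can be handed directly to a plotting function.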

4.22 Animations

# load data
gdp_ranked <- read_csv("input/gdp_ranked.csv") %>%
  mutate(rank = fct_rev(factor(rank)))
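
The `rank` column is read from the file here. The sketch below shows how such a per-year ranking could be derived and why `fct_rev()` is applied; the values are hypothetical toy numbers, not the real GDP figures.

```r
library(dplyr)
library(forcats)

# Hypothetical toy values, not taken from gdp_ranked.csv
toy_gdp <- tibble(
  year = rep(c(2000, 2001), each = 3),
  country = rep(c("X", "Y", "Z"), times = 2),
  gdp = c(10, 30, 20, 25, 15, 40)
)

toy_ranked <- toy_gdp %>%
  group_by(year) %>%
  mutate(rank = min_rank(desc(gdp))) %>%  # 1 = largest GDP within each year
  ungroup() %>%
  mutate(rank = fct_rev(factor(rank)))    # reverse the levels so rank 1 is drawn at the top

levels(toy_ranked$rank)
```

ggplot2 draws the first factor level at the bottom of a discrete y axis, so reversing the levels puts the largest economy on top.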

# Think of an animation as faceting by time
gdp_ranked %>%
  filter(year > 1985 & year %% 5 == 0) %>%
  ggplot(aes(gdp, rank)) +
  geom_col(aes(fill = country)) +
  facet_wrap(vars(year))

gdp_ranked %>%
  # gganimate uses the `group` aesthetic to track objects across frames
  ggplot(aes(gdp, rank, group = country)) + 
  geom_col(aes(fill = country)) +
  transition_states(year, transition_length = 5)

gdp_ranked %>%
  ggplot(aes(gdp, rank, group = country)) +
  geom_col(aes(fill = country)) +
  geom_text(
    aes(x = -200, label = country),
    hjust = 1, size = 14/.pt
  ) +
  xlim(-7000, 23000) +
  labs(title = "year: {closest_state}") +
  theme_minimal_vgrid(14, rel_small = 1) +
  theme(
    axis.text.y = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.line.y = element_blank()
  ) + 
  guides(fill = "none") +
  transition_states(year, transition_length = 5)

selected <- c("China", "Japan",
  "United States", "Germany", "Brazil")
gdp_ranked %>%
  filter(country %in% selected) %>%
  ggplot(aes(year, gdp, color = country)) +
  geom_line() +
  geom_point(size = 3) +
  scale_y_log10() +
  transition_reveal(year)

gdp_ranked %>%
  filter(country %in% selected) %>%
  ggplot(aes(year, gdp, color = country)) +
  geom_line() +
  geom_point(size = 3) +
  geom_text_repel(
    aes(label = country),
    hjust = 0,
    nudge_x = 2,
    direction = "y",
    xlim = c(NA, Inf)
  ) +
  scale_y_log10() +
  guides(color = "none") +
  coord_cartesian(clip = "off") +
  theme(plot.margin = margin(7, 100, 7, 7)) +
  transition_reveal(year)

# p <- ggplot(iris, aes(x = Petal.Width, y = Petal.Length)) + 
#   geom_point()
# p
# 
# anim <- p + 
#   transition_states(Species,
#                     transition_length = 2,
#                     state_length = 1)
# 
# anim

5 Data visualisation using R, for researchers who don’t use R

5.1 Getting Started

# set default theme
theme_set(theme_minimal())

# load data
dat <- read_csv(file = "input/ldt_data.csv")
headTail(dat)
## id   age language rt_word rt_nonword acc_word acc_nonword
## S001  22        1  379.46     516.82       99          90
## S002  33        1  312.45     435.04       94          82
## S003  23        1  404.94      458.5       96          87
## S004  28        1  298.37     335.89       92          76
## ...   (rows S005 to S096 not shown)
## S097  22        2   370.5     555.91       97          83
## S098  29        2  331.15     532.29       93          77
## S099  26        2  274.55     536.64       92          81
## S100  43        2  351.22     601.34       95          83

# recode language as a factor
#
# Option 1 (mutate)
dat <- dat |>
  mutate(language = factor(
    x = language,
    levels = c(1, 2),
    labels = c("monolingual", "bilingual")
  ))

# Option 2 (within)
# dat <- within(dat, language <- factor(language, levels = c(1, 2), labels = c("monolingual", "bilingual")))
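
How `levels` and `labels` pair up in `factor()` can be checked on a short vector. A sketch with hypothetical codes (`codes` and `lang` are illustrative names):

```r
# Hypothetical codes as they might appear in the raw file
codes <- c(1, 2, 2, 1)

lang <- factor(codes, levels = c(1, 2), labels = c("monolingual", "bilingual"))
lang

# levels and labels are matched by position, so their order must agree
levels(lang)

# a code that is not listed in `levels` becomes NA
factor(c(1, 3), levels = c(1, 2), labels = c("monolingual", "bilingual"))
```

The order given in `levels` also fixes the order in which the groups later appear on plot axes and in legends.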

headTail(dat)
## id   age language    rt_word rt_nonword acc_word acc_nonword
## S001  22 monolingual  379.46     516.82       99          90
## S002  33 monolingual  312.45     435.04       94          82
## S003  23 monolingual  404.94      458.5       96          87
## S004  28 monolingual  298.37     335.89       92          76
## ...   (rows S005 to S096 not shown)
## S097  22 bilingual     370.5     555.91       97          83
## S098  29 bilingual    331.15     532.29       93          77
## S099  26 bilingual    274.55     536.64       92          81
## S100  43 bilingual    351.22     601.34       95          83

5.2 Descriptive Statistics

# Demographic information
# Age mean, sd and counts
age_stats <- dat |> 
  group_by(language) |> 
  summarise(
    mean_age = mean(age),
    sd_age = sd(age),
    n_values = n()
  )

# plotting bar graph
ggplot(dat, aes(x = language)) +
  geom_bar() +
  scale_x_discrete(
    name = "Language group",
    labels = c("Monolingual", "Bilingual")) +
  scale_y_continuous(
    name = "Number of participants",
    breaks = seq(0, 50, 10),
    expand = c(0, 0)
  ) +
  theme_minimal_hgrid(
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(color = "black"),
    axis.ticks = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )

# calculating pct
dat_percent <- dat |> 
  group_by(language) |> 
  count() |> 
  ungroup() |> 
  mutate(pct = (n/sum(n)*100))
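
The same percentages can be obtained a little more compactly, since `count()` accepts the grouping variable directly. A sketch with hypothetical group sizes (`toy` and `toy_percent` are illustrative names):

```r
library(dplyr)

# Hypothetical group sizes, for illustration only
toy <- tibble(language = factor(
  c(rep("monolingual", 3), "bilingual"),
  levels = c("monolingual", "bilingual")
))

toy_percent <- toy |>
  count(language) |>              # same result as group_by() + count() + ungroup()
  mutate(pct = n / sum(n) * 100)  # sum(n) is the overall total because the result is ungrouped

toy_percent
```

If the data were still grouped when `pct` is computed, `sum(n)` would be taken within each group and every percentage would come out as 100.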

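# plotting the percentages as a bar chart (geom_bar(stat = "identity") is equivalent to geom_col())
ggplot(dat_percent, aes(x = language, y = pct)) +
  geom_bar(stat = "identity")

# plotting a histogram of age
ggplot(dat, aes(x = age)) +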
  geom_histogram(binwidth = 1,
                 fill = "white",
                 colour = "black") +
  scale_y_continuous(
    limits = c(0, 11),
    expand = c(0, 0)
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks.x = element_blank(),
    axis.ticks.y = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )

# transforming data: from wide to long
# Step 1: stack the four DV columns; the column names go to dv_condition
long <- pivot_longer(
  data = dat,
  cols = rt_word:acc_nonword,
  names_to = "dv_condition",
  values_to = "dv"
)
long
id age language dv_condition dv
S001 22 monolingual rt_word 379.4585
S001 22 monolingual rt_nonword 516.8176
S001 22 monolingual acc_word 99.0000
S001 22 monolingual acc_nonword 90.0000
S002 33 monolingual rt_word 312.4513
S002 33 monolingual rt_nonword 435.0404
S002 33 monolingual acc_word 94.0000
S002 33 monolingual acc_nonword 82.0000
S003 23 monolingual rt_word 404.9407
S003 23 monolingual rt_nonword 458.5022
S003 23 monolingual acc_word 96.0000
S003 23 monolingual acc_nonword 87.0000
S004 28 monolingual rt_word 298.3734
S004 28 monolingual rt_nonword 335.8933
S004 28 monolingual acc_word 92.0000
S004 28 monolingual acc_nonword 76.0000
S005 26 monolingual rt_word 316.4250
S005 26 monolingual rt_nonword 401.3214
S005 26 monolingual acc_word 91.0000
S005 26 monolingual acc_nonword 83.0000
S006 29 monolingual rt_word 357.1710
S006 29 monolingual rt_nonword 367.3355
S006 29 monolingual acc_word 96.0000
S006 29 monolingual acc_nonword 78.0000
S007 20 monolingual rt_word 372.9137
S007 20 monolingual rt_nonword 434.7055
S007 20 monolingual acc_word 95.0000
S007 20 monolingual acc_nonword 86.0000
S008 30 monolingual rt_word 326.9963
S008 30 monolingual rt_nonword 424.7618
S008 30 monolingual acc_word 91.0000
S008 30 monolingual acc_nonword 80.0000
S009 26 monolingual rt_word 305.8424
S009 26 monolingual rt_nonword 454.6146
S009 26 monolingual acc_word 94.0000
S009 26 monolingual acc_nonword 86.0000
S010 22 monolingual rt_word 317.3134
S010 22 monolingual rt_nonword 414.4109
S010 22 monolingual acc_word 94.0000
S010 22 monolingual acc_nonword 88.0000
S011 48 monolingual rt_word 449.6311
S011 48 monolingual rt_nonword 494.5900
S011 48 monolingual acc_word 95.0000
S011 48 monolingual acc_nonword 83.0000
S012 21 monolingual rt_word 324.6647
S012 21 monolingual rt_nonword 396.7591
S012 21 monolingual acc_word 95.0000
S012 21 monolingual acc_nonword 84.0000
S013 31 monolingual rt_word 330.5515
S013 31 monolingual rt_nonword 466.4763
S013 31 monolingual acc_word 92.0000
S013 31 monolingual acc_nonword 82.0000
S014 26 monolingual rt_word 355.5141
S014 26 monolingual rt_nonword 511.6196
S014 26 monolingual acc_word 98.0000
S014 26 monolingual acc_nonword 89.0000
S015 25 monolingual rt_word 408.7605
S015 25 monolingual rt_nonword 487.3767
S015 25 monolingual acc_word 97.0000
S015 25 monolingual acc_nonword 84.0000
S016 24 monolingual rt_word 265.9044
S016 24 monolingual rt_nonword 373.2863
S016 24 monolingual acc_word 89.0000
S016 24 monolingual acc_nonword 83.0000
S017 49 monolingual rt_word 411.6540
S017 49 monolingual rt_nonword 436.6086
S017 49 monolingual acc_word 95.0000
S017 49 monolingual acc_nonword 87.0000
S018 23 monolingual rt_word 393.3934
S018 23 monolingual rt_nonword 494.6254
S018 23 monolingual acc_word 94.0000
S018 23 monolingual acc_nonword 84.0000
S019 36 monolingual rt_word 301.5985
S019 36 monolingual rt_nonword 521.4523
S019 36 monolingual acc_word 95.0000
S019 36 monolingual acc_nonword 83.0000
S020 23 monolingual rt_word 372.8018
S020 23 monolingual rt_nonword 508.8880
S020 23 monolingual acc_word 96.0000
S020 23 monolingual acc_nonword 88.0000
S021 25 monolingual rt_word 310.7352
S021 25 monolingual rt_nonword 327.2975
S021 25 monolingual acc_word 92.0000
S021 25 monolingual acc_nonword 79.0000
S022 35 monolingual rt_word 334.6046
S022 35 monolingual rt_nonword 471.5940
S022 35 monolingual acc_word 96.0000
S022 35 monolingual acc_nonword 80.0000
S023 24 monolingual rt_word 356.2256
S023 24 monolingual rt_nonword 488.7731
S023 24 monolingual acc_word 97.0000
S023 24 monolingual acc_nonword 87.0000
S024 31 monolingual rt_word 421.9908
S024 31 monolingual rt_nonword 428.9551
S024 31 monolingual acc_word 99.0000
S024 31 monolingual acc_nonword 84.0000
S025 26 monolingual rt_word 344.0360
S025 26 monolingual rt_nonword 501.9067
S025 26 monolingual acc_word 93.0000
S025 26 monolingual acc_nonword 86.0000
S026 19 monolingual rt_word 332.5621
S026 19 monolingual rt_nonword 418.2602
S026 19 monolingual acc_word 95.0000
S026 19 monolingual acc_nonword 85.0000
S027 30 monolingual rt_word 361.3947
S027 30 monolingual rt_nonword 388.8271
S027 30 monolingual acc_word 96.0000
S027 30 monolingual acc_nonword 81.0000
S028 42 monolingual rt_word 352.1527
S028 42 monolingual rt_nonword 411.6460
S028 42 monolingual acc_word 94.0000
S028 42 monolingual acc_nonword 84.0000
S029 35 monolingual rt_word 368.5695
S029 35 monolingual rt_nonword 457.5472
S029 35 monolingual acc_word 95.0000
S029 35 monolingual acc_nonword 86.0000
S030 24 monolingual rt_word 361.4795
S030 24 monolingual rt_nonword 415.3485
S030 24 monolingual acc_word 95.0000
S030 24 monolingual acc_nonword 86.0000
S031 21 monolingual rt_word 315.9988
S031 21 monolingual rt_nonword 461.4133
S031 21 monolingual acc_word 94.0000
S031 21 monolingual acc_nonword 88.0000
S032 19 monolingual rt_word 329.2842
S032 19 monolingual rt_nonword 380.7887
S032 19 monolingual acc_word 96.0000
S032 19 monolingual acc_nonword 88.0000
S033 24 monolingual rt_word 317.4919
S033 24 monolingual rt_nonword 356.1347
S033 24 monolingual acc_word 91.0000
S033 24 monolingual acc_nonword 79.0000
S034 19 monolingual rt_word 336.3335
S034 19 monolingual rt_nonword 422.6830
S034 19 monolingual acc_word 95.0000
S034 19 monolingual acc_nonword 87.0000
S035 41 monolingual rt_word 353.7729
S035 41 monolingual rt_nonword 509.4866
S035 41 monolingual acc_word 95.0000
S035 41 monolingual acc_nonword 90.0000
S036 25 monolingual rt_word 374.7791
S036 25 monolingual rt_nonword 455.0422
S036 25 monolingual acc_word 96.0000
S036 25 monolingual acc_nonword 86.0000
S037 30 monolingual rt_word 372.0309
S037 30 monolingual rt_nonword 432.4536
S037 30 monolingual acc_word 93.0000
S037 30 monolingual acc_nonword 81.0000
S038 37 monolingual rt_word 350.4335
S038 37 monolingual rt_nonword 486.7266
S038 37 monolingual acc_word 93.0000
S038 37 monolingual acc_nonword 86.0000
S039 20 monolingual rt_word 351.0863
S039 20 monolingual rt_nonword 480.4578
S039 20 monolingual acc_word 93.0000
S039 20 monolingual acc_nonword 82.0000
S040 28 monolingual rt_word 381.3349
S040 28 monolingual rt_nonword 441.9900
S040 28 monolingual acc_word 94.0000
S040 28 monolingual acc_nonword 85.0000
S041 29 monolingual rt_word 400.9932
S041 29 monolingual rt_nonword 464.0155
S041 29 monolingual acc_word 97.0000
S041 29 monolingual acc_nonword 86.0000
S042 29 monolingual rt_word 387.3436
S042 29 monolingual rt_nonword 422.9989
S042 29 monolingual acc_word 95.0000
S042 29 monolingual acc_nonword 81.0000
S043 28 monolingual rt_word 361.3111
S043 28 monolingual rt_nonword 444.7627
S043 28 monolingual acc_word 93.0000
S043 28 monolingual acc_nonword 81.0000
S044 27 monolingual rt_word 320.5438
S044 27 monolingual rt_nonword 430.1967
S044 27 monolingual acc_word 95.0000
S044 27 monolingual acc_nonword 88.0000
S045 37 monolingual rt_word 421.4205
S045 37 monolingual rt_nonword 410.3928
S045 37 monolingual acc_word 98.0000
S045 37 monolingual acc_nonword 84.0000
S046 24 monolingual rt_word 385.1441
S046 24 monolingual rt_nonword 437.9147
S046 24 monolingual acc_word 97.0000
S046 24 monolingual acc_nonword 88.0000
S047 19 monolingual rt_word 364.2024
S047 19 monolingual rt_nonword 521.5333
S047 19 monolingual acc_word 98.0000
S047 19 monolingual acc_nonword 92.0000
S048 29 monolingual rt_word 453.8601
S048 29 monolingual rt_nonword 527.3333
S048 29 monolingual acc_word 96.0000
S048 29 monolingual acc_nonword 93.0000
S049 29 monolingual rt_word 349.3883
S049 29 monolingual rt_nonword 440.0325
S049 29 monolingual acc_word 95.0000
S049 29 monolingual acc_nonword 82.0000
S050 22 monolingual rt_word 346.8406
S050 22 monolingual rt_nonword 460.0266
S050 22 monolingual acc_word 96.0000
S050 22 monolingual acc_nonword 86.0000
S051 31 monolingual rt_word 359.2907
S051 31 monolingual rt_nonword 439.0429
S051 31 monolingual acc_word 93.0000
S051 31 monolingual acc_nonword 82.0000
S052 32 monolingual rt_word 366.5222
S052 32 monolingual rt_nonword 433.3109
S052 32 monolingual acc_word 92.0000
S052 32 monolingual acc_nonword 85.0000
S053 26 monolingual rt_word 376.8269
S053 26 monolingual rt_nonword 504.6637
S053 26 monolingual acc_word 97.0000
S053 26 monolingual acc_nonword 88.0000
S054 30 monolingual rt_word 425.8213
S054 30 monolingual rt_nonword 569.0931
S054 30 monolingual acc_word 97.0000
S054 30 monolingual acc_nonword 90.0000
S055 26 monolingual rt_word 344.1231
S055 26 monolingual rt_nonword 447.2772
S055 26 monolingual acc_word 99.0000
S055 26 monolingual acc_nonword 89.0000
S056 28 bilingual rt_word 323.3023
S056 28 bilingual rt_nonword 593.1752
S056 28 bilingual acc_word 94.0000
S056 28 bilingual acc_nonword 82.0000
S057 20 bilingual rt_word 303.0015
S057 20 bilingual rt_nonword 557.7900
S057 20 bilingual acc_word 92.0000
S057 20 bilingual acc_nonword 83.0000
S058 30 bilingual rt_word 306.8853
S058 30 bilingual rt_nonword 598.3902
S058 30 bilingual acc_word 93.0000
S058 30 bilingual acc_nonword 85.0000
S059 42 bilingual rt_word 361.1029
S059 42 bilingual rt_nonword 580.4153
S059 42 bilingual acc_word 97.0000
S059 42 bilingual acc_nonword 83.0000
S060 22 bilingual rt_word 354.6346
S060 22 bilingual rt_nonword 601.6694
S060 22 bilingual acc_word 95.0000
S060 22 bilingual acc_nonword 82.0000
S061 19 bilingual rt_word 308.6390
S061 19 bilingual rt_nonword 642.5463
S061 19 bilingual acc_word 95.0000
S061 19 bilingual acc_nonword 88.0000
S062 33 bilingual rt_word 479.6013
S062 33 bilingual rt_nonword 706.2317
S062 33 bilingual acc_word 97.0000
S062 33 bilingual acc_nonword 82.0000
S063 25 bilingual rt_word 285.6374
S063 25 bilingual rt_nonword 628.6810
S063 25 bilingual acc_word 96.0000
S063 25 bilingual acc_nonword 83.0000
S064 21 bilingual rt_word 318.2438
S064 21 bilingual rt_nonword 525.0792
S064 21 bilingual acc_word 94.0000
S064 21 bilingual acc_nonword 88.0000
S065 19 bilingual rt_word 256.2833
S065 19 bilingual rt_nonword 555.9724
S065 19 bilingual acc_word 94.0000
S065 19 bilingual acc_nonword 86.0000
S066 49 bilingual rt_word 359.9618
S066 49 bilingual rt_nonword 634.2251
S066 49 bilingual acc_word 96.0000
S066 49 bilingual acc_nonword 85.0000
S067 42 bilingual rt_word 305.0576
S067 42 bilingual rt_nonword 529.1660
S067 42 bilingual acc_word 92.0000
S067 42 bilingual acc_nonword 84.0000
S068 24 bilingual rt_word 379.6834
S068 24 bilingual rt_nonword 647.4263
S068 24 bilingual acc_word 97.0000
S068 24 bilingual acc_nonword 87.0000
S069 45 bilingual rt_word 412.4538
S069 45 bilingual rt_nonword 672.0647
S069 45 bilingual acc_word 96.0000
S069 45 bilingual acc_nonword 91.0000
S070 34 bilingual rt_word 338.3924
S070 34 bilingual rt_nonword 492.8611
S070 34 bilingual acc_word 92.0000
S070 34 bilingual acc_nonword 81.0000
S071 30 bilingual rt_word 421.6405
S071 30 bilingual rt_nonword 581.0442
S071 30 bilingual acc_word 99.0000
S071 30 bilingual acc_nonword 90.0000
S072 58 bilingual rt_word 355.7949
S072 58 bilingual rt_nonword 589.4649
S072 58 bilingual acc_word 94.0000
S072 58 bilingual acc_nonword 83.0000
S073 35 bilingual rt_word 397.7493
S073 35 bilingual rt_nonword 649.2890
S073 35 bilingual acc_word 96.0000
S073 35 bilingual acc_nonword 88.0000
S074 29 bilingual rt_word 296.7600
S074 29 bilingual rt_nonword 530.2701
S074 29 bilingual acc_word 91.0000
S074 29 bilingual acc_nonword 80.0000
S075 25 bilingual rt_word 283.7838
S075 25 bilingual rt_nonword 569.7511
S075 25 bilingual acc_word 93.0000
S075 25 bilingual acc_nonword 81.0000
S076 30 bilingual rt_word 324.8029
S076 30 bilingual rt_nonword 583.1495
S076 30 bilingual acc_word 94.0000
S076 30 bilingual acc_nonword 83.0000
S077 33 bilingual rt_word 293.4677
S077 33 bilingual rt_nonword 662.9756
S077 33 bilingual acc_word 94.0000
S077 33 bilingual acc_nonword 85.0000
S078 36 bilingual rt_word 363.7906
S078 36 bilingual rt_nonword 649.4659
S078 36 bilingual acc_word 97.0000
S078 36 bilingual acc_nonword 87.0000
S079 26 bilingual rt_word 293.7165
S079 26 bilingual rt_nonword 567.1610
S079 26 bilingual acc_word 95.0000
S079 26 bilingual acc_nonword 87.0000
S080 40 bilingual rt_word 427.3684
S080 40 bilingual rt_nonword 666.4698
S080 40 bilingual acc_word 99.0000
S080 40 bilingual acc_nonword 84.0000
S081 39 bilingual rt_word 377.6361
S081 39 bilingual rt_nonword 651.8712
S081 39 bilingual acc_word 95.0000
S081 39 bilingual acc_nonword 86.0000
S082 41 bilingual rt_word 451.2491
S082 41 bilingual rt_nonword 617.4923
S082 41 bilingual acc_word 100.0000
S082 41 bilingual acc_nonword 92.0000
S083 33 bilingual rt_word 306.6871
S083 33 bilingual rt_nonword 563.0389
S083 33 bilingual acc_word 95.0000
S083 33 bilingual acc_nonword 87.0000
S084 25 bilingual rt_word 356.0584
S084 25 bilingual rt_nonword 658.6612
S084 25 bilingual acc_word 96.0000
S084 25 bilingual acc_nonword 92.0000
S085 34 bilingual rt_word 412.7244
S085 34 bilingual rt_nonword 626.4060
S085 34 bilingual acc_word 97.0000
S085 34 bilingual acc_nonword 87.0000
S086 30 bilingual rt_word 344.9887
S086 30 bilingual rt_nonword 633.7884
S086 30 bilingual acc_word 98.0000
S086 30 bilingual acc_nonword 91.0000
S087 27 bilingual rt_word 329.0913
S087 27 bilingual rt_nonword 635.9451
S087 27 bilingual acc_word 94.0000
S087 27 bilingual acc_nonword 89.0000
S088 22 bilingual rt_word 383.4406
S088 22 bilingual rt_nonword 582.7703
S088 22 bilingual acc_word 97.0000
S088 22 bilingual acc_nonword 88.0000
S089 27 bilingual rt_word 294.6573
S089 27 bilingual rt_nonword 555.2142
S089 27 bilingual acc_word 93.0000
S089 27 bilingual acc_nonword 80.0000
S090 27 bilingual rt_word 386.6881
S090 27 bilingual rt_nonword 644.5688
S090 27 bilingual acc_word 94.0000
S090 27 bilingual acc_nonword 88.0000
S091 38 bilingual rt_word 333.9251
S091 38 bilingual rt_nonword 570.6841
S091 38 bilingual acc_word 96.0000
S091 38 bilingual acc_nonword 83.0000
S092 30 bilingual rt_word 381.4393
S092 30 bilingual rt_nonword 579.6419
S092 30 bilingual acc_word 98.0000
S092 30 bilingual acc_nonword 83.0000
S093 51 bilingual rt_word 415.8978
S093 51 bilingual rt_nonword 692.2597
S093 51 bilingual acc_word 97.0000
S093 51 bilingual acc_nonword 90.0000
S094 18 bilingual rt_word 327.5815
S094 18 bilingual rt_nonword 576.2896
S094 18 bilingual acc_word 94.0000
S094 18 bilingual acc_nonword 84.0000
S095 30 bilingual rt_word 353.7837
S095 30 bilingual rt_nonword 592.7136
S095 30 bilingual acc_word 96.0000
S095 30 bilingual acc_nonword 83.0000
S096 50 bilingual rt_word 331.0409
S096 50 bilingual rt_nonword 530.9726
S096 50 bilingual acc_word 94.0000
S096 50 bilingual acc_nonword 77.0000
S097 22 bilingual rt_word 370.4969
S097 22 bilingual rt_nonword 555.9138
S097 22 bilingual acc_word 97.0000
S097 22 bilingual acc_nonword 83.0000
S098 29 bilingual rt_word 331.1488
S098 29 bilingual rt_nonword 532.2910
S098 29 bilingual acc_word 93.0000
S098 29 bilingual acc_nonword 77.0000
S099 26 bilingual rt_word 274.5491
S099 26 bilingual rt_nonword 536.6449
S099 26 bilingual acc_word 92.0000
S099 26 bilingual acc_nonword 81.0000
S100 43 bilingual rt_word 351.2182
S100 43 bilingual rt_nonword 601.3442
S100 43 bilingual acc_word 95.0000
S100 43 bilingual acc_nonword 83.0000
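
The reshape is easier to follow on a table small enough to read in full. The sketch below uses a hypothetical two-participant table with the same column naming scheme, and shows both the single name column from Step 1 and the `names_sep` variant used in Step 2.

```r
library(tidyr)
library(tibble)

# Hypothetical two-participant table with the same column naming scheme
wide <- tibble(
  id = c("S1", "S2"),
  rt_word = c(380, 310), rt_nonword = c(520, 430),
  acc_word = c(99, 94), acc_nonword = c(90, 82)
)

# One name column: each participant contributes one row per measured column
long_toy <- pivot_longer(wide, cols = rt_word:acc_nonword,
                         names_to = "dv_condition", values_to = "dv")
nrow(long_toy)  # 2 participants x 4 columns

# Splitting the names at "_" yields separate dv_type and condition columns
long2_toy <- pivot_longer(wide, cols = rt_word:acc_nonword,
                          names_sep = "_",
                          names_to = c("dv_type", "condition"),
                          values_to = "dv")
unique(long2_toy$dv_type)
```

The same logic explains the row count above: 100 participants with four measured columns each give 400 rows.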
# Step 2: split the column names at "_" into dv_type and condition
long2 <- pivot_longer(
  data = dat,
  cols = rt_word:acc_nonword,
  names_sep = "_",
  names_to = c("dv_type", "condition"),
  values_to = "dv"
)
long2
id age language dv_type condition dv
S001 22 monolingual rt word 379.4585
S001 22 monolingual rt nonword 516.8176
S001 22 monolingual acc word 99.0000
S001 22 monolingual acc nonword 90.0000
S002 33 monolingual rt word 312.4513
S002 33 monolingual rt nonword 435.0404
S002 33 monolingual acc word 94.0000
S002 33 monolingual acc nonword 82.0000
S003 23 monolingual rt word 404.9407
S003 23 monolingual rt nonword 458.5022
S003 23 monolingual acc word 96.0000
S003 23 monolingual acc nonword 87.0000
S004 28 monolingual rt word 298.3734
S004 28 monolingual rt nonword 335.8933
S004 28 monolingual acc word 92.0000
S004 28 monolingual acc nonword 76.0000
S005 26 monolingual rt word 316.4250
S005 26 monolingual rt nonword 401.3214
S005 26 monolingual acc word 91.0000
S005 26 monolingual acc nonword 83.0000
S006 29 monolingual rt word 357.1710
S006 29 monolingual rt nonword 367.3355
S006 29 monolingual acc word 96.0000
S006 29 monolingual acc nonword 78.0000
S007 20 monolingual rt word 372.9137
S007 20 monolingual rt nonword 434.7055
S007 20 monolingual acc word 95.0000
S007 20 monolingual acc nonword 86.0000
S008 30 monolingual rt word 326.9963
S008 30 monolingual rt nonword 424.7618
S008 30 monolingual acc word 91.0000
S008 30 monolingual acc nonword 80.0000
S009 26 monolingual rt word 305.8424
S009 26 monolingual rt nonword 454.6146
S009 26 monolingual acc word 94.0000
S009 26 monolingual acc nonword 86.0000
S010 22 monolingual rt word 317.3134
S010 22 monolingual rt nonword 414.4109
S010 22 monolingual acc word 94.0000
S010 22 monolingual acc nonword 88.0000
S011 48 monolingual rt word 449.6311
S011 48 monolingual rt nonword 494.5900
S011 48 monolingual acc word 95.0000
S011 48 monolingual acc nonword 83.0000
S012 21 monolingual rt word 324.6647
S012 21 monolingual rt nonword 396.7591
S012 21 monolingual acc word 95.0000
S012 21 monolingual acc nonword 84.0000
S013 31 monolingual rt word 330.5515
S013 31 monolingual rt nonword 466.4763
S013 31 monolingual acc word 92.0000
S013 31 monolingual acc nonword 82.0000
S014 26 monolingual rt word 355.5141
S014 26 monolingual rt nonword 511.6196
S014 26 monolingual acc word 98.0000
S014 26 monolingual acc nonword 89.0000
S015 25 monolingual rt word 408.7605
S015 25 monolingual rt nonword 487.3767
S015 25 monolingual acc word 97.0000
S015 25 monolingual acc nonword 84.0000
S016 24 monolingual rt word 265.9044
S016 24 monolingual rt nonword 373.2863
S016 24 monolingual acc word 89.0000
S016 24 monolingual acc nonword 83.0000
S017 49 monolingual rt word 411.6540
S017 49 monolingual rt nonword 436.6086
S017 49 monolingual acc word 95.0000
S017 49 monolingual acc nonword 87.0000
S018 23 monolingual rt word 393.3934
S018 23 monolingual rt nonword 494.6254
S018 23 monolingual acc word 94.0000
S018 23 monolingual acc nonword 84.0000
S019 36 monolingual rt word 301.5985
S019 36 monolingual rt nonword 521.4523
S019 36 monolingual acc word 95.0000
S019 36 monolingual acc nonword 83.0000
S020 23 monolingual rt word 372.8018
S020 23 monolingual rt nonword 508.8880
S020 23 monolingual acc word 96.0000
S020 23 monolingual acc nonword 88.0000
S021 25 monolingual rt word 310.7352
S021 25 monolingual rt nonword 327.2975
S021 25 monolingual acc word 92.0000
S021 25 monolingual acc nonword 79.0000
S022 35 monolingual rt word 334.6046
S022 35 monolingual rt nonword 471.5940
S022 35 monolingual acc word 96.0000
S022 35 monolingual acc nonword 80.0000
S023 24 monolingual rt word 356.2256
S023 24 monolingual rt nonword 488.7731
S023 24 monolingual acc word 97.0000
S023 24 monolingual acc nonword 87.0000
S024 31 monolingual rt word 421.9908
S024 31 monolingual rt nonword 428.9551
S024 31 monolingual acc word 99.0000
S024 31 monolingual acc nonword 84.0000
S025 26 monolingual rt word 344.0360
S025 26 monolingual rt nonword 501.9067
S025 26 monolingual acc word 93.0000
S025 26 monolingual acc nonword 86.0000
S026 19 monolingual rt word 332.5621
S026 19 monolingual rt nonword 418.2602
S026 19 monolingual acc word 95.0000
S026 19 monolingual acc nonword 85.0000
S027 30 monolingual rt word 361.3947
S027 30 monolingual rt nonword 388.8271
S027 30 monolingual acc word 96.0000
S027 30 monolingual acc nonword 81.0000
S028 42 monolingual rt word 352.1527
S028 42 monolingual rt nonword 411.6460
S028 42 monolingual acc word 94.0000
S028 42 monolingual acc nonword 84.0000
S029 35 monolingual rt word 368.5695
S029 35 monolingual rt nonword 457.5472
S029 35 monolingual acc word 95.0000
S029 35 monolingual acc nonword 86.0000
S030 24 monolingual rt word 361.4795
S030 24 monolingual rt nonword 415.3485
S030 24 monolingual acc word 95.0000
S030 24 monolingual acc nonword 86.0000
S031 21 monolingual rt word 315.9988
S031 21 monolingual rt nonword 461.4133
S031 21 monolingual acc word 94.0000
S031 21 monolingual acc nonword 88.0000
S032 19 monolingual rt word 329.2842
S032 19 monolingual rt nonword 380.7887
S032 19 monolingual acc word 96.0000
S032 19 monolingual acc nonword 88.0000
S033 24 monolingual rt word 317.4919
S033 24 monolingual rt nonword 356.1347
S033 24 monolingual acc word 91.0000
S033 24 monolingual acc nonword 79.0000
S034 19 monolingual rt word 336.3335
S034 19 monolingual rt nonword 422.6830
S034 19 monolingual acc word 95.0000
S034 19 monolingual acc nonword 87.0000
S035 41 monolingual rt word 353.7729
S035 41 monolingual rt nonword 509.4866
S035 41 monolingual acc word 95.0000
S035 41 monolingual acc nonword 90.0000
S036 25 monolingual rt word 374.7791
S036 25 monolingual rt nonword 455.0422
S036 25 monolingual acc word 96.0000
S036 25 monolingual acc nonword 86.0000
S037 30 monolingual rt word 372.0309
S037 30 monolingual rt nonword 432.4536
S037 30 monolingual acc word 93.0000
S037 30 monolingual acc nonword 81.0000
S038 37 monolingual rt word 350.4335
S038 37 monolingual rt nonword 486.7266
S038 37 monolingual acc word 93.0000
S038 37 monolingual acc nonword 86.0000
S039 20 monolingual rt word 351.0863
S039 20 monolingual rt nonword 480.4578
S039 20 monolingual acc word 93.0000
S039 20 monolingual acc nonword 82.0000
S040 28 monolingual rt word 381.3349
S040 28 monolingual rt nonword 441.9900
S040 28 monolingual acc word 94.0000
S040 28 monolingual acc nonword 85.0000
S041 29 monolingual rt word 400.9932
S041 29 monolingual rt nonword 464.0155
S041 29 monolingual acc word 97.0000
S041 29 monolingual acc nonword 86.0000
S042 29 monolingual rt word 387.3436
S042 29 monolingual rt nonword 422.9989
S042 29 monolingual acc word 95.0000
S042 29 monolingual acc nonword 81.0000
S043 28 monolingual rt word 361.3111
S043 28 monolingual rt nonword 444.7627
S043 28 monolingual acc word 93.0000
S043 28 monolingual acc nonword 81.0000
S044 27 monolingual rt word 320.5438
S044 27 monolingual rt nonword 430.1967
S044 27 monolingual acc word 95.0000
S044 27 monolingual acc nonword 88.0000
S045 37 monolingual rt word 421.4205
S045 37 monolingual rt nonword 410.3928
S045 37 monolingual acc word 98.0000
S045 37 monolingual acc nonword 84.0000
S046 24 monolingual rt word 385.1441
S046 24 monolingual rt nonword 437.9147
S046 24 monolingual acc word 97.0000
S046 24 monolingual acc nonword 88.0000
S047 19 monolingual rt word 364.2024
S047 19 monolingual rt nonword 521.5333
S047 19 monolingual acc word 98.0000
S047 19 monolingual acc nonword 92.0000
S048 29 monolingual rt word 453.8601
S048 29 monolingual rt nonword 527.3333
S048 29 monolingual acc word 96.0000
S048 29 monolingual acc nonword 93.0000
S049 29 monolingual rt word 349.3883
S049 29 monolingual rt nonword 440.0325
S049 29 monolingual acc word 95.0000
S049 29 monolingual acc nonword 82.0000
S050 22 monolingual rt word 346.8406
S050 22 monolingual rt nonword 460.0266
S050 22 monolingual acc word 96.0000
S050 22 monolingual acc nonword 86.0000
S051 31 monolingual rt word 359.2907
S051 31 monolingual rt nonword 439.0429
S051 31 monolingual acc word 93.0000
S051 31 monolingual acc nonword 82.0000
S052 32 monolingual rt word 366.5222
S052 32 monolingual rt nonword 433.3109
S052 32 monolingual acc word 92.0000
S052 32 monolingual acc nonword 85.0000
S053 26 monolingual rt word 376.8269
S053 26 monolingual rt nonword 504.6637
S053 26 monolingual acc word 97.0000
S053 26 monolingual acc nonword 88.0000
S054 30 monolingual rt word 425.8213
S054 30 monolingual rt nonword 569.0931
S054 30 monolingual acc word 97.0000
S054 30 monolingual acc nonword 90.0000
S055 26 monolingual rt word 344.1231
S055 26 monolingual rt nonword 447.2772
S055 26 monolingual acc word 99.0000
S055 26 monolingual acc nonword 89.0000
S056 28 bilingual rt word 323.3023
S056 28 bilingual rt nonword 593.1752
S056 28 bilingual acc word 94.0000
S056 28 bilingual acc nonword 82.0000
S057 20 bilingual rt word 303.0015
S057 20 bilingual rt nonword 557.7900
S057 20 bilingual acc word 92.0000
S057 20 bilingual acc nonword 83.0000
S058 30 bilingual rt word 306.8853
S058 30 bilingual rt nonword 598.3902
S058 30 bilingual acc word 93.0000
S058 30 bilingual acc nonword 85.0000
S059 42 bilingual rt word 361.1029
S059 42 bilingual rt nonword 580.4153
S059 42 bilingual acc word 97.0000
S059 42 bilingual acc nonword 83.0000
S060 22 bilingual rt word 354.6346
S060 22 bilingual rt nonword 601.6694
S060 22 bilingual acc word 95.0000
S060 22 bilingual acc nonword 82.0000
S061 19 bilingual rt word 308.6390
S061 19 bilingual rt nonword 642.5463
S061 19 bilingual acc word 95.0000
S061 19 bilingual acc nonword 88.0000
S062 33 bilingual rt word 479.6013
S062 33 bilingual rt nonword 706.2317
S062 33 bilingual acc word 97.0000
S062 33 bilingual acc nonword 82.0000
S063 25 bilingual rt word 285.6374
S063 25 bilingual rt nonword 628.6810
S063 25 bilingual acc word 96.0000
S063 25 bilingual acc nonword 83.0000
S064 21 bilingual rt word 318.2438
S064 21 bilingual rt nonword 525.0792
S064 21 bilingual acc word 94.0000
S064 21 bilingual acc nonword 88.0000
S065 19 bilingual rt word 256.2833
S065 19 bilingual rt nonword 555.9724
S065 19 bilingual acc word 94.0000
S065 19 bilingual acc nonword 86.0000
S066 49 bilingual rt word 359.9618
S066 49 bilingual rt nonword 634.2251
S066 49 bilingual acc word 96.0000
S066 49 bilingual acc nonword 85.0000
S067 42 bilingual rt word 305.0576
S067 42 bilingual rt nonword 529.1660
S067 42 bilingual acc word 92.0000
S067 42 bilingual acc nonword 84.0000
S068 24 bilingual rt word 379.6834
S068 24 bilingual rt nonword 647.4263
S068 24 bilingual acc word 97.0000
S068 24 bilingual acc nonword 87.0000
S069 45 bilingual rt word 412.4538
S069 45 bilingual rt nonword 672.0647
S069 45 bilingual acc word 96.0000
S069 45 bilingual acc nonword 91.0000
S070 34 bilingual rt word 338.3924
S070 34 bilingual rt nonword 492.8611
S070 34 bilingual acc word 92.0000
S070 34 bilingual acc nonword 81.0000
S071 30 bilingual rt word 421.6405
S071 30 bilingual rt nonword 581.0442
S071 30 bilingual acc word 99.0000
S071 30 bilingual acc nonword 90.0000
S072 58 bilingual rt word 355.7949
S072 58 bilingual rt nonword 589.4649
S072 58 bilingual acc word 94.0000
S072 58 bilingual acc nonword 83.0000
S073 35 bilingual rt word 397.7493
S073 35 bilingual rt nonword 649.2890
S073 35 bilingual acc word 96.0000
S073 35 bilingual acc nonword 88.0000
S074 29 bilingual rt word 296.7600
S074 29 bilingual rt nonword 530.2701
S074 29 bilingual acc word 91.0000
S074 29 bilingual acc nonword 80.0000
S075 25 bilingual rt word 283.7838
S075 25 bilingual rt nonword 569.7511
S075 25 bilingual acc word 93.0000
S075 25 bilingual acc nonword 81.0000
S076 30 bilingual rt word 324.8029
S076 30 bilingual rt nonword 583.1495
S076 30 bilingual acc word 94.0000
S076 30 bilingual acc nonword 83.0000
S077 33 bilingual rt word 293.4677
S077 33 bilingual rt nonword 662.9756
S077 33 bilingual acc word 94.0000
S077 33 bilingual acc nonword 85.0000
S078 36 bilingual rt word 363.7906
S078 36 bilingual rt nonword 649.4659
S078 36 bilingual acc word 97.0000
S078 36 bilingual acc nonword 87.0000
S079 26 bilingual rt word 293.7165
S079 26 bilingual rt nonword 567.1610
S079 26 bilingual acc word 95.0000
S079 26 bilingual acc nonword 87.0000
S080 40 bilingual rt word 427.3684
S080 40 bilingual rt nonword 666.4698
S080 40 bilingual acc word 99.0000
S080 40 bilingual acc nonword 84.0000
S081 39 bilingual rt word 377.6361
S081 39 bilingual rt nonword 651.8712
S081 39 bilingual acc word 95.0000
S081 39 bilingual acc nonword 86.0000
S082 41 bilingual rt word 451.2491
S082 41 bilingual rt nonword 617.4923
S082 41 bilingual acc word 100.0000
S082 41 bilingual acc nonword 92.0000
S083 33 bilingual rt word 306.6871
S083 33 bilingual rt nonword 563.0389
S083 33 bilingual acc word 95.0000
S083 33 bilingual acc nonword 87.0000
S084 25 bilingual rt word 356.0584
S084 25 bilingual rt nonword 658.6612
S084 25 bilingual acc word 96.0000
S084 25 bilingual acc nonword 92.0000
S085 34 bilingual rt word 412.7244
S085 34 bilingual rt nonword 626.4060
S085 34 bilingual acc word 97.0000
S085 34 bilingual acc nonword 87.0000
S086 30 bilingual rt word 344.9887
S086 30 bilingual rt nonword 633.7884
S086 30 bilingual acc word 98.0000
S086 30 bilingual acc nonword 91.0000
S087 27 bilingual rt word 329.0913
S087 27 bilingual rt nonword 635.9451
S087 27 bilingual acc word 94.0000
S087 27 bilingual acc nonword 89.0000
S088 22 bilingual rt word 383.4406
S088 22 bilingual rt nonword 582.7703
S088 22 bilingual acc word 97.0000
S088 22 bilingual acc nonword 88.0000
S089 27 bilingual rt word 294.6573
S089 27 bilingual rt nonword 555.2142
S089 27 bilingual acc word 93.0000
S089 27 bilingual acc nonword 80.0000
S090 27 bilingual rt word 386.6881
S090 27 bilingual rt nonword 644.5688
S090 27 bilingual acc word 94.0000
S090 27 bilingual acc nonword 88.0000
S091 38 bilingual rt word 333.9251
S091 38 bilingual rt nonword 570.6841
S091 38 bilingual acc word 96.0000
S091 38 bilingual acc nonword 83.0000
S092 30 bilingual rt word 381.4393
S092 30 bilingual rt nonword 579.6419
S092 30 bilingual acc word 98.0000
S092 30 bilingual acc nonword 83.0000
S093 51 bilingual rt word 415.8978
S093 51 bilingual rt nonword 692.2597
S093 51 bilingual acc word 97.0000
S093 51 bilingual acc nonword 90.0000
S094 18 bilingual rt word 327.5815
S094 18 bilingual rt nonword 576.2896
S094 18 bilingual acc word 94.0000
S094 18 bilingual acc nonword 84.0000
S095 30 bilingual rt word 353.7837
S095 30 bilingual rt nonword 592.7136
S095 30 bilingual acc word 96.0000
S095 30 bilingual acc nonword 83.0000
S096 50 bilingual rt word 331.0409
S096 50 bilingual rt nonword 530.9726
S096 50 bilingual acc word 94.0000
S096 50 bilingual acc nonword 77.0000
S097 22 bilingual rt word 370.4969
S097 22 bilingual rt nonword 555.9138
S097 22 bilingual acc word 97.0000
S097 22 bilingual acc nonword 83.0000
S098 29 bilingual rt word 331.1488
S098 29 bilingual rt nonword 532.2910
S098 29 bilingual acc word 93.0000
S098 29 bilingual acc nonword 77.0000
S099 26 bilingual rt word 274.5491
S099 26 bilingual rt nonword 536.6449
S099 26 bilingual acc word 92.0000
S099 26 bilingual acc nonword 81.0000
S100 43 bilingual rt word 351.2182
S100 43 bilingual rt nonword 601.3442
S100 43 bilingual acc word 95.0000
S100 43 bilingual acc nonword 83.0000
# Step 3
dat_long <- pivot_wider(
  data = long2,
  names_from = "dv_type",
  values_from = "dv"
)
dat_long
id age language condition rt acc
S001 22 monolingual word 379.4585 99
S001 22 monolingual nonword 516.8176 90
S002 33 monolingual word 312.4513 94
S002 33 monolingual nonword 435.0404 82
S003 23 monolingual word 404.9407 96
S003 23 monolingual nonword 458.5022 87
S004 28 monolingual word 298.3734 92
S004 28 monolingual nonword 335.8933 76
S005 26 monolingual word 316.4250 91
S005 26 monolingual nonword 401.3214 83
S006 29 monolingual word 357.1710 96
S006 29 monolingual nonword 367.3355 78
S007 20 monolingual word 372.9137 95
S007 20 monolingual nonword 434.7055 86
S008 30 monolingual word 326.9963 91
S008 30 monolingual nonword 424.7618 80
S009 26 monolingual word 305.8424 94
S009 26 monolingual nonword 454.6146 86
S010 22 monolingual word 317.3134 94
S010 22 monolingual nonword 414.4109 88
S011 48 monolingual word 449.6311 95
S011 48 monolingual nonword 494.5900 83
S012 21 monolingual word 324.6647 95
S012 21 monolingual nonword 396.7591 84
S013 31 monolingual word 330.5515 92
S013 31 monolingual nonword 466.4763 82
S014 26 monolingual word 355.5141 98
S014 26 monolingual nonword 511.6196 89
S015 25 monolingual word 408.7605 97
S015 25 monolingual nonword 487.3767 84
S016 24 monolingual word 265.9044 89
S016 24 monolingual nonword 373.2863 83
S017 49 monolingual word 411.6540 95
S017 49 monolingual nonword 436.6086 87
S018 23 monolingual word 393.3934 94
S018 23 monolingual nonword 494.6254 84
S019 36 monolingual word 301.5985 95
S019 36 monolingual nonword 521.4523 83
S020 23 monolingual word 372.8018 96
S020 23 monolingual nonword 508.8880 88
S021 25 monolingual word 310.7352 92
S021 25 monolingual nonword 327.2975 79
S022 35 monolingual word 334.6046 96
S022 35 monolingual nonword 471.5940 80
S023 24 monolingual word 356.2256 97
S023 24 monolingual nonword 488.7731 87
S024 31 monolingual word 421.9908 99
S024 31 monolingual nonword 428.9551 84
S025 26 monolingual word 344.0360 93
S025 26 monolingual nonword 501.9067 86
S026 19 monolingual word 332.5621 95
S026 19 monolingual nonword 418.2602 85
S027 30 monolingual word 361.3947 96
S027 30 monolingual nonword 388.8271 81
S028 42 monolingual word 352.1527 94
S028 42 monolingual nonword 411.6460 84
S029 35 monolingual word 368.5695 95
S029 35 monolingual nonword 457.5472 86
S030 24 monolingual word 361.4795 95
S030 24 monolingual nonword 415.3485 86
S031 21 monolingual word 315.9988 94
S031 21 monolingual nonword 461.4133 88
S032 19 monolingual word 329.2842 96
S032 19 monolingual nonword 380.7887 88
S033 24 monolingual word 317.4919 91
S033 24 monolingual nonword 356.1347 79
S034 19 monolingual word 336.3335 95
S034 19 monolingual nonword 422.6830 87
S035 41 monolingual word 353.7729 95
S035 41 monolingual nonword 509.4866 90
S036 25 monolingual word 374.7791 96
S036 25 monolingual nonword 455.0422 86
S037 30 monolingual word 372.0309 93
S037 30 monolingual nonword 432.4536 81
S038 37 monolingual word 350.4335 93
S038 37 monolingual nonword 486.7266 86
S039 20 monolingual word 351.0863 93
S039 20 monolingual nonword 480.4578 82
S040 28 monolingual word 381.3349 94
S040 28 monolingual nonword 441.9900 85
S041 29 monolingual word 400.9932 97
S041 29 monolingual nonword 464.0155 86
S042 29 monolingual word 387.3436 95
S042 29 monolingual nonword 422.9989 81
S043 28 monolingual word 361.3111 93
S043 28 monolingual nonword 444.7627 81
S044 27 monolingual word 320.5438 95
S044 27 monolingual nonword 430.1967 88
S045 37 monolingual word 421.4205 98
S045 37 monolingual nonword 410.3928 84
S046 24 monolingual word 385.1441 97
S046 24 monolingual nonword 437.9147 88
S047 19 monolingual word 364.2024 98
S047 19 monolingual nonword 521.5333 92
S048 29 monolingual word 453.8601 96
S048 29 monolingual nonword 527.3333 93
S049 29 monolingual word 349.3883 95
S049 29 monolingual nonword 440.0325 82
S050 22 monolingual word 346.8406 96
S050 22 monolingual nonword 460.0266 86
S051 31 monolingual word 359.2907 93
S051 31 monolingual nonword 439.0429 82
S052 32 monolingual word 366.5222 92
S052 32 monolingual nonword 433.3109 85
S053 26 monolingual word 376.8269 97
S053 26 monolingual nonword 504.6637 88
S054 30 monolingual word 425.8213 97
S054 30 monolingual nonword 569.0931 90
S055 26 monolingual word 344.1231 99
S055 26 monolingual nonword 447.2772 89
S056 28 bilingual word 323.3023 94
S056 28 bilingual nonword 593.1752 82
S057 20 bilingual word 303.0015 92
S057 20 bilingual nonword 557.7900 83
S058 30 bilingual word 306.8853 93
S058 30 bilingual nonword 598.3902 85
S059 42 bilingual word 361.1029 97
S059 42 bilingual nonword 580.4153 83
S060 22 bilingual word 354.6346 95
S060 22 bilingual nonword 601.6694 82
S061 19 bilingual word 308.6390 95
S061 19 bilingual nonword 642.5463 88
S062 33 bilingual word 479.6013 97
S062 33 bilingual nonword 706.2317 82
S063 25 bilingual word 285.6374 96
S063 25 bilingual nonword 628.6810 83
S064 21 bilingual word 318.2438 94
S064 21 bilingual nonword 525.0792 88
S065 19 bilingual word 256.2833 94
S065 19 bilingual nonword 555.9724 86
S066 49 bilingual word 359.9618 96
S066 49 bilingual nonword 634.2251 85
S067 42 bilingual word 305.0576 92
S067 42 bilingual nonword 529.1660 84
S068 24 bilingual word 379.6834 97
S068 24 bilingual nonword 647.4263 87
S069 45 bilingual word 412.4538 96
S069 45 bilingual nonword 672.0647 91
S070 34 bilingual word 338.3924 92
S070 34 bilingual nonword 492.8611 81
S071 30 bilingual word 421.6405 99
S071 30 bilingual nonword 581.0442 90
S072 58 bilingual word 355.7949 94
S072 58 bilingual nonword 589.4649 83
S073 35 bilingual word 397.7493 96
S073 35 bilingual nonword 649.2890 88
S074 29 bilingual word 296.7600 91
S074 29 bilingual nonword 530.2701 80
S075 25 bilingual word 283.7838 93
S075 25 bilingual nonword 569.7511 81
S076 30 bilingual word 324.8029 94
S076 30 bilingual nonword 583.1495 83
S077 33 bilingual word 293.4677 94
S077 33 bilingual nonword 662.9756 85
S078 36 bilingual word 363.7906 97
S078 36 bilingual nonword 649.4659 87
S079 26 bilingual word 293.7165 95
S079 26 bilingual nonword 567.1610 87
S080 40 bilingual word 427.3684 99
S080 40 bilingual nonword 666.4698 84
S081 39 bilingual word 377.6361 95
S081 39 bilingual nonword 651.8712 86
S082 41 bilingual word 451.2491 100
S082 41 bilingual nonword 617.4923 92
S083 33 bilingual word 306.6871 95
S083 33 bilingual nonword 563.0389 87
S084 25 bilingual word 356.0584 96
S084 25 bilingual nonword 658.6612 92
S085 34 bilingual word 412.7244 97
S085 34 bilingual nonword 626.4060 87
S086 30 bilingual word 344.9887 98
S086 30 bilingual nonword 633.7884 91
S087 27 bilingual word 329.0913 94
S087 27 bilingual nonword 635.9451 89
S088 22 bilingual word 383.4406 97
S088 22 bilingual nonword 582.7703 88
S089 27 bilingual word 294.6573 93
S089 27 bilingual nonword 555.2142 80
S090 27 bilingual word 386.6881 94
S090 27 bilingual nonword 644.5688 88
S091 38 bilingual word 333.9251 96
S091 38 bilingual nonword 570.6841 83
S092 30 bilingual word 381.4393 98
S092 30 bilingual nonword 579.6419 83
S093 51 bilingual word 415.8978 97
S093 51 bilingual nonword 692.2597 90
S094 18 bilingual word 327.5815 94
S094 18 bilingual nonword 576.2896 84
S095 30 bilingual word 353.7837 96
S095 30 bilingual nonword 592.7136 83
S096 50 bilingual word 331.0409 94
S096 50 bilingual nonword 530.9726 77
S097 22 bilingual word 370.4969 97
S097 22 bilingual nonword 555.9138 83
S098 29 bilingual word 331.1488 93
S098 29 bilingual nonword 532.2910 77
S099 26 bilingual word 274.5491 92
S099 26 bilingual nonword 536.6449 81
S100 43 bilingual word 351.2182 95
S100 43 bilingual nonword 601.3442 83
# The whole pipeline
# dat_long <- pivot_longer(
#   data = dat,
#   cols = rt_word:acc_nonword,
#   names_sep = "_",
#   names_to = c("dv_type", "condition"),
#   values_to = "dv"
# ) %>%
#   pivot_wider(names_from = "dv_type",
#               values_from = "dv")
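
The reshaping steps above can be checked on a small hand-made table. This is a minimal sketch, assuming only that tidyr and tibble are installed; `toy` and `toy_long` are hypothetical names, and the values are those of participants S001 and S002 in the printed output.

```r
library(tidyr)

# Hypothetical two-row wide table; values copied from participants S001 and S002
toy <- tibble::tibble(
  id = c("S001", "S002"),
  rt_word = c(379.4585, 312.4513),
  rt_nonword = c(516.8176, 435.0404),
  acc_word = c(99, 94),
  acc_nonword = c(90, 82)
)

toy_long <- toy |>
  # split names such as "rt_word" into dv_type ("rt") and condition ("word")
  pivot_longer(
    cols = rt_word:acc_nonword,
    names_sep = "_",
    names_to = c("dv_type", "condition"),
    values_to = "dv"
  ) |>
  # spread dv_type back out so rt and acc become separate columns
  pivot_wider(names_from = "dv_type", values_from = "dv")

toy_long
```

The result has one row per participant and condition, with `rt` and `acc` as separate columns, which is the shape the plots below expect.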


# plotting rt (hist)
ggplot(dat_long, aes(x = rt)) +
  geom_histogram(binwidth = 10,
                 fill = "white",
                 colour = "black") +
  # rt is mapped to x, so the axis title belongs on the x scale
  scale_x_continuous(name = "Reaction time (ms)") +
  scale_y_continuous(
    name = "Count",
    limits = c(0, 11),
    expand = c(0, 0)
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks.x = element_blank(),
    axis.ticks.y = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )

# plotting accuracy (hist)
ggplot(dat_long, aes(x = acc)) +
  geom_histogram(binwidth = 1,
                 fill = "white",
                 colour = "black") +
  # acc is mapped to x, so the axis title belongs on the x scale
  scale_x_continuous(name = "Accuracy (0-100)") +
  scale_y_continuous(
    name = "Count",
    limits = c(0, 18),
    expand = c(0, 0)
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks.x = element_blank(),
    axis.ticks.y = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )

# density plot
p5 <- ggplot(dat_long, aes(x = rt, fill = condition)) +
  geom_density() +
  # rt is mapped to x; the y axis shows the estimated density
  scale_x_continuous(name = "Reaction time (ms)") +
  scale_y_continuous(
    name = "Density",
    expand = c(0, 0)
  ) +
  scale_fill_discrete(
    name = "Condition",
    # named labels are matched by value, not by position
    labels = c(word = "Word", nonword = "Non-word")
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks.x = element_line(color = "black"),
    axis.ticks.y = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )
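
A character column such as `condition` is placed on a discrete scale in alphabetical order, the same order base R's `factor()` uses by default, so a positional `labels` vector is matched against `nonword` first and `word` second. The base-R sketch below illustrates that ordering and one way to fix it explicitly; the vector `condition` and the object names are hypothetical stand-ins for the column in `dat_long`.

```r
# Hypothetical stand-in for the condition column of dat_long
condition <- c("word", "nonword", "word", "nonword")

# Default ordering of a character variable: alphabetical
default_levels <- levels(factor(condition))

# Explicit ordering and display labels, paired level by level
relabelled <- factor(condition,
                     levels = c("word", "nonword"),
                     labels = c("Word", "Non-word"))

default_levels
levels(relabelled)
table(condition, relabelled)
```

Setting the levels explicitly, or passing a named `labels` vector to the scale, keeps each label attached to the right value regardless of sort order.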

# scatterplot
ggplot(dat_long, aes(x = rt, y = age, color = condition)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_colour_discrete(
    name = "Condition",
    labels = c(word = "Word", nonword = "Non-word")
  ) +
  theme_cowplot() +
  theme(
    axis.line = element_line(size = .3)
  )

# plotting relation between rt and condition (using wide-form data)
p4 <- ggplot(dat, aes(x = rt_word, y = rt_nonword, color = language)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_colour_viridis_d(
    name = "Language group",
    labels = c(monolingual = "Monolingual", bilingual = "Bilingual"),
    option = "E"
  ) +
  theme_cowplot() +
  theme(
    axis.line = element_line(size = .3)
  )

5.3 Representing Summary Statistics

# boxplot
ggplot(dat_long, aes(x = condition, y = acc, fill = language)) +
  geom_boxplot() +
  scale_fill_viridis_d(
    option = "E",
    name = "Group",
    labels = c(monolingual = "Monolingual", bilingual = "Bilingual")
  ) +
  scale_x_discrete(
    name = "Condition",
    labels = c(word = "Word", nonword = "Non-word")
    ) +
  scale_y_continuous(
    name = "Accuracy"
  ) +
  theme_cowplot() +
  theme(
    axis.line = element_line(size = .3)
  )

# violin plot
ggplot(dat_long, aes(x = condition, y = acc, fill = language)) +
  geom_violin() +
  scale_fill_viridis_d(
    option = "D",
    name = "Group",
    labels = c(monolingual = "Monolingual", bilingual = "Bilingual")
  ) +
  scale_x_discrete(
    name = "Condition",
    labels = c(word = "Word", nonword = "Non-word")
    ) +
  scale_y_continuous(
    name = "Accuracy"
  ) +
  theme_cowplot() +
  theme(
    axis.line = element_line(size = .3)
  )

# bar chart of means
ggplot(dat_long, aes(x = condition, y = rt)) +
  stat_summary(fun = "mean", geom = "bar", fill = "blue") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .2) +
  scale_y_continuous(
    expand = c(0, 0)
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )
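
`stat_summary(fun.data = "mean_se")` draws the error bars at the mean plus and minus one standard error. As a rough check of what that summary contains, the base-R sketch below computes the same three quantities by hand; `rt_word` and `summary_row` are hypothetical names, and the four values are the word-condition reaction times of participants S001 to S004 from the printed output.

```r
# rt values for the first four participants in the word condition
rt_word <- c(379.4585, 312.4513, 404.9407, 298.3734)

m  <- mean(rt_word)                         # height of the bar
se <- sd(rt_word) / sqrt(length(rt_word))   # standard error of the mean

# same column names that mean_se() returns: y, ymin, ymax
summary_row <- data.frame(y = m, ymin = m - se, ymax = m + se)
summary_row
```

The error bars are symmetric around the mean by construction; a wider interval would need a different summary function.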

# grouped violin-boxplot
ggplot(dat_long, aes(x = condition, y = rt, fill = language)) +
  geom_violin(
    alpha = .4
  ) +
  geom_boxplot(
    width = .2,
    fatten = NULL,
    position = position_dodge(.9),
    alpha = .4
  ) +
  stat_summary(fun = "mean", geom = "point", position = position_dodge(.9)) +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .1, position = position_dodge(.9)) + 
  scale_y_continuous(
    expand = c(0, 0)
  ) +
  scale_fill_viridis_d(
    option = "E"
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  )

# interaction plot
ggplot(dat_long, aes(x = condition, y = rt, shape = language, group = language, color = language)) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_summary(fun = "mean", geom = "line") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .2) +
  scale_color_manual(
    values = c("blue", "darkorange")
  ) +
  theme_cowplot() +
  theme(
  axis.line = element_line(size = .3)
      )

# combined interaction plot
p3 <- ggplot(dat_long, aes(x = condition, y = rt, shape = language, group = language)) +
  geom_point(aes(color = language), alpha = .2) +
  geom_line(aes(group = id, color = language), alpha = .2) +
  stat_summary(fun = "mean", geom = "point", size = 2, color = "black") +
  stat_summary(fun = "mean", geom = "line", color = "black") +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .2, color = "black") +
  theme_cowplot() +
  theme(
  axis.line = element_line(size = .3)
      )

5.4 Facets

# scatterplots
p1 <- ggplot(dat_long, aes(x = rt, y = age, color = condition)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_colour_discrete(
    name = "Condition",
    labels = c(word = "Word", nonword = "Non-word")
  ) +
  theme_cowplot() +
  theme(
    axis.line = element_line(size = .3)
  ) +
  facet_wrap(~condition)

# grouped violin-boxplot
p2 <- ggplot(dat_long, aes(x = condition, y = rt, fill = language)) +
  geom_violin(
    alpha = .4
  ) +
  geom_boxplot(
    width = .2,
    fatten = NULL,
    position = position_dodge(.9),
    alpha = .4
  ) +
  stat_summary(fun = "mean", geom = "point", position = position_dodge(.9)) +
  stat_summary(fun.data = "mean_se", geom = "errorbar", width = .1, position = position_dodge(.9)) + 
  scale_y_continuous(
    expand = c(0, 0)
  ) +
  scale_fill_viridis_d(
    option = "E"
  ) +
  theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks = element_blank(),
    panel.grid = element_line(linetype = "dashed")
  ) +
  facet_wrap(~factor(language,
                     levels = c("monolingual", "bilingual"),
                     labels = c("Monolingual participants", "Bilingual participants")
  )) +
  guides(fill = "none") # remove legend
p2

# saving plots as images
# ggsave(filename = "grouped_violin_plots.png", plot = p2)

# arranging multiple plots
p1 + p2 # side-by-side

p1 / p2 # stacked

(p3 | p4) / p1 + p3 # multiple plots

# labeling axis with labs()
p3 + labs(
  x = "Type of word",
  y = "Reaction time (ms)",
  title = "Language group by word type interaction plot",
  subtitle = "Reaction time data"
)

5.5 Advanced Plots

theme_hgrid_config <- theme_minimal_hgrid(
    font_size = 11,
    line_size = .3
  ) +
  theme(
    axis.line.x.bottom = element_line(size = .3, color = "black"),
    axis.ticks = element_blank(),
    panel.grid = element_line(linetype = "dashed"))

# split-violin plots
ggplot(dat_long, aes(x = condition, y = rt, fill = language)) +
  introdataviz::geom_split_violin(alpha = .4, trim = FALSE) +
  geom_boxplot(width = .2, alpha = .6, show.legend = FALSE) +
  stat_summary(fun.data = "mean_se", geom = "pointrange", show.legend = FALSE,
               position = position_dodge(.175)) +
  scale_x_discrete(name = "Condition", labels = c("Non-word", "Word")) +
  scale_y_continuous(name = "Reaction time (ms)",
                     breaks = seq(200, 800, 100), 
                     limits = c(200, 800),
                     expand = c(0, 0)) +
  scale_fill_viridis_d(option = "E", name = "Language group") +
  theme_hgrid_config

# Raincloud plots
rain_height <- .1

ggplot(dat_long, aes(x = "", y = rt, fill = language)) +
  # clouds
  introdataviz::geom_flat_violin(trim = FALSE, alpha = 0.4,
    position = position_nudge(x = rain_height + .05)) +
  # rain
  geom_point(aes(colour = language), size = 2, alpha = .5, show.legend = FALSE, 
              position = position_jitter(width = rain_height, height = 0)) +
  # boxplots
  geom_boxplot(width = rain_height, alpha = 0.4, show.legend = FALSE, 
               outlier.shape = NA,
               position = position_nudge(x = -rain_height*2)) +
  # mean and SE point in the cloud
  stat_summary(fun.data = mean_se, mapping = aes(color = language), show.legend = FALSE,
               position = position_nudge(x = rain_height * 3)) +
  # adjust layout
  scale_x_discrete(name = "", expand = c(rain_height*3, 0, 0, 0.7)) +
  scale_y_continuous(name = "Reaction time (ms)",
                     breaks = seq(200, 800, 100), 
                     limits = c(200, 800)) +
  coord_flip() +
  facet_wrap(~factor(condition, 
                     levels = c("word", "nonword"), 
                     labels = c("Word", "Non-Word")), 
             nrow = 2) +
  # custom colours and theme
  scale_fill_viridis_d(option = "E", name = "Language group") +
  scale_colour_viridis_d(option = "E") +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        legend.position = c(0.8, 0.8),
        legend.background = element_rect(fill = "white", color = "white"),
        panel.grid = element_line(linetype = "dashed"))

# Ridge plots
# read in data from Nation et al. 2017
data <- read_csv("https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv")

# convert to long format and percents
long <- pivot_longer(data, cols = everything(),
                     names_to = "label",
                     values_to = "prob") %>%
  mutate(label = factor(label, names(data), names(data)),
         prob = prob/100)

# ridge plot
ggplot(long, aes(x = prob, y = label, fill = label)) + 
  ggridges::geom_density_ridges(scale = 2, show.legend = FALSE) +
  scale_x_continuous(name = "Assigned Probability", 
                     limits = c(0, 1.1), labels = scales::percent,
                     expand = c(0, 0)
                     ) +
  # control space at top and bottom of plot
  scale_y_discrete(name = "", expand = c(0.02, 0, .08, 0)) +
  theme_dviz_vgrid() +
  theme(
    panel.grid = element_line(size = .3, linetype = "dashed"),
    panel.border = element_blank(),
    axis.ticks.y = element_blank()
  )

# Alluvial plots
# simulate data for 4 years of grades from 500 students
# with a correlation of 0.75 from year to year
# and a slight increase each year
dat <- faux::sim_design(
  within = list(year = c("Y1", "Y2", "Y3", "Y4")),
  n = 500,
  mu = c(Y1 = 0, Y2 = .2, Y3 = .4, Y4 = .6), r = 0.75, 
  dv = "grade", long = TRUE, plot = FALSE) %>%
  # convert numeric grades to letters with a defined probability
  mutate(grade = faux::norm2likert(grade, prob = c("3rd" = 5, "2.2" = 10, "2.1" = 40, "1st" = 20)),
         grade = factor(grade, c("1st", "2.1", "2.2", "3rd"))) %>%
  # reformat data and count each combination
  tidyr::pivot_wider(names_from = year, values_from = grade) %>%
  dplyr::count(Y1, Y2, Y3, Y4)

# plot data with colours by final-year (Y4) grades
# geom_alluvium() and geom_stratum() come from ggalluvial, which must be attached
ggplot(dat, aes(y = n, axis1 = Y1, axis2 = Y2, axis3 = Y3, axis4 = Y4)) +
  geom_alluvium(aes(fill = Y4), width = 1/6) +
  geom_stratum(fill = "grey", width = 1/3, color = "black") +
  geom_label(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_fill_viridis_d(name = "Final Classification") +
  theme_minimal() +
  theme(legend.position = "top")

6 Tutorials

6.1 Labelling Bar Graphs in ggplot2

6.1.1 Data preparation

mpg_sum <- mpg |>
  dplyr::filter(year == 2008) |>
  dplyr::mutate(
    # capitalize first letter
    manufacturer = stringr::str_to_title(manufacturer),
    # turn into lumped factors with capitalized names
    manufacturer = forcats::fct_lump(manufacturer, n = 10)
  ) |>
  # count and sort occurrences
  dplyr::count(manufacturer, sort = TRUE) |>
  dplyr::mutate(
    # order factor levels by number, put "Other" to end
    manufacturer = forcats::fct_rev(forcats::fct_inorder(manufacturer)),
    manufacturer = forcats::fct_relevel(manufacturer, "Other", after = 0)
  )
# the ordering is reversed because {ggplot2} plots factor levels from bottom to top when they are mapped to y
mpg_sum
## manufacturer   n
## Dodge         21
## Toyota        14
## Chevrolet     12
## Volkswagen    11
## Other         11
## Ford          10
## Audi           9
## Hyundai        8
## Subaru         8
## Nissan         7
## Jeep           6
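
The effect of the two reordering steps can be checked in isolation. The following is a minimal sketch on a made-up character vector (`makes` is a hypothetical stand-in, not the `mpg` data), assuming the forcats package is installed. It shows how reversing the order of appearance and then moving "Other" to the first level produces the level order that ggplot2 draws from bottom to top on the y axis.

```r
# toy vector, already sorted by frequency as dplyr::count(sort = TRUE) would return it
makes <- c("Dodge", "Toyota", "Other", "Ford")

# use the order of appearance as level order, then reverse it
lvl_rev <- forcats::fct_rev(forcats::fct_inorder(makes))
levels(lvl_rev)

# move "Other" to the first level, which ggplot2 places at the bottom of the y axis
lvl_final <- forcats::fct_relevel(lvl_rev, "Other", after = 0)
levels(lvl_final)
```

The last level ("Dodge" in this sketch) ends up at the top of the plot, the first level ("Other") at the bottom.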

6.1.2 Data visualization with ggplot2

# plotting the basic bar plot
ggplot(mpg_sum, aes(x = n, y = manufacturer)) +
  geom_col(fill = "gray70") +
  theme_minimal()

# calculate percentages and store them as a label column
# option 1: using sprintf() to create percentage labels
mpg_sum <- mpg_sum |> 
  dplyr::mutate(
    perc = paste0(sprintf("%4.1f", n / sum(n) * 100), "%")
  )
mpg_sum
## manufacturer   n   perc
## Dodge         21  17.9%
## Toyota        14  12.0%
## Chevrolet     12  10.3%
## Volkswagen    11   9.4%
## Other         11   9.4%
## Ford          10   8.5%
## Audi           9   7.7%
## Hyundai        8   6.8%
## Subaru         8   6.8%
## Nissan         7   6.0%
## Jeep           6   5.1%
# option 2: using percent() from the scales package
# mpg_sum <- mpg_sum |> 
#   dplyr::mutate(
#     perc = scales::percent(n / sum(n), accuracy = .1, trim = FALSE)
#   )
# mpg_sum
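
The format string in option 1 does more than round. As a small base R sketch (the two counts and the total of 117 are taken from the `mpg_sum` table printed earlier; `n`, `total` and `labels` are illustrative names), `"%4.1f"` pads every number to a minimum width of four characters, so all labels end up with the same length:

```r
# counts for Dodge and Volkswagen; 117 is the sum of all counts in mpg_sum
n <- c(21, 11)
total <- 117

# "%4.1f": one decimal place, padded with spaces to at least 4 characters
labels <- paste0(sprintf("%4.1f", n / total * 100), "%")
labels
nchar(labels)
```

Equal label widths make right-aligned labels line up inside the bars; option 2 presumably aims at the same padding through `trim = FALSE`.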

# adding the percentage label
ggplot(mpg_sum, aes(x = n, y = manufacturer)) +
  geom_col(fill = "gray70") +
  geom_text(aes(label = perc)) +
  theme_minimal()

# adding some description to one of the bars
mpg_sum <- mpg_sum |> 
  dplyr::mutate(
    perc = paste0(sprintf("%4.1f", n / sum(n) * 100), "%"),
    perc = if_else(row_number() == 1, paste(perc, "of all car models"), perc)
  )

ggplot(mpg_sum, aes(x = n, y = manufacturer)) +
  geom_col(fill = "gray70") +
  geom_text(aes(label = perc)) +
  theme_minimal()

# example of creating and placing labels on the fly
# prepare non-aggregated data set with lumped and ordered factors
# mpg_fct <- mpg %>%
#   dplyr::filter(year == 2008) %>%
#   dplyr::mutate(
#     # add count to calculate percentages later
#     total = dplyr::n(),
#     # turn into lumped factors with capitalized names
#     manufacturer = stringr::str_to_title(manufacturer),
#     manufacturer = forcats::fct_lump(manufacturer, n = 10),
#     # order factor levels by number, put "Other" to end
#     manufacturer = forcats::fct_rev(forcats::fct_infreq(manufacturer)),
#     manufacturer = forcats::fct_relevel(manufacturer, "Other", after = 0)
#   )
# mpg_fct
# 
# ggplot(mpg_fct, aes(x = manufacturer)) +
#   geom_bar(fill = "gray70") +
#   # add count labels
#   geom_text(
#     stat = "count",
#     aes(label = after_stat(count))
#   ) +
#   # rotate plot
#   coord_flip() +
#   theme_minimal()

# locating labels inside the bars
ggplot(mpg_sum, aes(x = n, y = manufacturer)) +
  geom_col(fill = "gray70") +
  geom_text(aes(label = perc),
    hjust = 1,
    nudge_x = -.5
  ) +
  theme_minimal()

# If you want to put the labels next to the bars, you often need to adjust the plot margin and/or the axis limits so that the labels are not cut off.
# The drawback of using limits is that you have to define them manually.
# You can make sure that labels are not truncated by the panel by adding clip = "off" to any coordinate system.
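
A minimal sketch of that approach, assuming ggplot2 is installed: `bars` is a hypothetical stand-in that copies the top three rows of the `mpg_sum` table, and the right margin of 50 pt is an arbitrary value that would need tuning for longer labels.

```r
library(ggplot2)

# stand-in for mpg_sum: the top three rows of the table printed earlier
bars <- data.frame(
  manufacturer = factor(c("Dodge", "Toyota", "Chevrolet"),
                        levels = c("Chevrolet", "Toyota", "Dodge")),
  n = c(21, 14, 12),
  perc = c("17.9%", "12.0%", "10.3%")
)

p <- ggplot(bars, aes(x = n, y = manufacturer)) +
  geom_col(fill = "gray70") +
  # hjust = 0 starts the label at the bar end, nudge_x adds a small gap
  geom_text(aes(label = perc), hjust = 0, nudge_x = .5) +
  # allow layers to draw outside the panel area
  coord_cartesian(clip = "off") +
  theme_minimal() +
  # extra right margin (in pt) so the labels have room; 50 is an arbitrary choice
  theme(plot.margin = margin(10, 50, 10, 10))
p
```

Compared with setting limits by hand, this keeps the axis range data-driven; only the margin has to be chosen.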

# adding colors to the bars using different hues

# option 1: create color palette based on input data
pal <- c(
  "gray85",
  # use the length of the manufacturer column for all non-highlighted bars and subtract the number of bars we want to highlight
  rep("gray70", length(mpg_sum$manufacturer) - 4), 
  "coral2", "mediumpurple1", "goldenrod1"
)

ggplot(mpg_sum, aes(x = n, y = manufacturer, fill = manufacturer)) +
  geom_col() +
  geom_text(aes(label = perc),
    hjust = 1,
    nudge_x = -.5
  ) +
  # add custom colors
  scale_fill_manual(values = pal, guide = "none") +
  theme_minimal()

# option 2: add the color to the data set, map the fill to that column and use scale_fill_identity()
# this option also works if the data are updated!
mpg_sum <- mpg_sum |>
  mutate(
    color = case_when(
      row_number() == 1 ~ "goldenrod1",
      row_number() == 2 ~ "mediumpurple1",
      row_number() == 3 ~ "coral2",
      manufacturer == "Other" ~ "gray85",
      # all others should be gray
      TRUE ~ "gray70"
    )
  )

ggplot(mpg_sum, aes(x = n, y = manufacturer, fill = color)) +
  geom_col() +
  geom_text(
    aes(label = perc),
    hjust = 1, nudge_x = -.5
  ) +
  # add custom colors
  scale_fill_identity(guide = "none") +
  theme_minimal()

# some polishing
ggplot(mpg_sum, aes(x = n, y = manufacturer, fill = color)) +
  geom_col() +
  geom_text(
    aes(label = perc),
    hjust = 1, nudge_x = -.5,
    size = 3.5, fontface = "bold", family = "Fira Sans"
  ) +
  scale_x_continuous(expand = c(.01, .01)) +
  # add custom colors
  scale_fill_identity(guide = "none") +
  theme_void() +
  theme(
    axis.text.y = element_text(size = 14, hjust = 1, family = "Fira Sans"),
    plot.margin = margin(15, 15, 15, 15)
  )

# adding label boxes for accessibility
ggplot(mpg_sum, aes(x = n, y = manufacturer, fill = color)) +
  geom_col() +
  geom_label(
    aes(label = perc),
    hjust = 1, nudge_x = -.5,
    size = 3.5, fontface = "bold", family = "Fira Sans",
    fill = "white", label.size = 0
  ) +
  scale_x_continuous(expand = c(.01, .01)) +
  # add custom colors
  scale_fill_identity(guide = "none") +
  theme_void() +
  theme(
    axis.text.y = element_text(size = 14, hjust = 1, family = "Fira Sans"),
    plot.margin = margin(15, 15, 15, 15)
  )

# with a different label placement
mpg_sum2 <- mpg_sum |>
  mutate(
    # set justification based on data
    # so that only the first label is placed inside
    place = if_else(row_number() == 1, 1, 0),
    # add some spacing to labels since we can't use nudge_x anymore
    perc = paste(" ", perc, " ")
  )
mpg_sum2
## manufacturer   n  perc                     color          place
## Dodge         21  17.9% of all car models  goldenrod1         1
## Toyota        14  12.0%                    mediumpurple1      0
## Chevrolet     12  10.3%                    coral2             0
## Volkswagen    11  9.4%                     gray70             0
## Other         11  9.4%                     gray85             0
## Ford          10  8.5%                     gray70             0
## Audi           9  7.7%                     gray70             0
## Hyundai        8  6.8%                     gray70             0
## Subaru         8  6.8%                     gray70             0
## Nissan         7  6.0%                     gray70             0
## Jeep           6  5.1%                     gray70             0
ggplot(mpg_sum2, aes(x = n, y = manufacturer, fill = color)) +
  geom_col() +
  geom_text(
    aes(label = perc, hjust = place),
    size = 4, fontface = "bold", family = "Fira Sans"
  ) +
  scale_x_continuous(expand = c(.01, .01)) +
  scale_fill_identity(guide = "none") +
  theme_void() +
  theme(
    axis.text.y = element_text(size = 14, hjust = 1, family = "Fira Sans"),
    plot.margin = margin(15, 15, 15, 15)
  )

6.2 Tables

Tables are a form of data visualization. If you want to show the exact amount of every value in your data, a table might be your best solution. But tables are especially susceptible to clutter.

[Figure: Anatomy of a Table]

The ten guidelines for better tables:

  • Rule 1. Offset the headers from the body
  • Rule 2. Use subtle dividers instead of heavy gridlines
  • Rule 3. Right-align numbers and headers
  • Rule 4. Left-align text and header
  • Rule 5. Select the appropriate level of precision
  • Rule 6. Guide your reader with space between rows and columns
  • Rule 7. Remove unit repetition
  • Rule 8. Highlight outliers
  • Rule 9. Group similar data and increase white space
  • Rule 10. Add visualizations when appropriate

  1. Rule 1. Offset the headers from the body

Make your column titles clear. Try using boldface text or lines to offset them from the numbers and text in the body of the table.

[Figure: Offset the headers from body]

  2. Rule 2. Use subtle dividers instead of heavy gridlines

For series that show the total, use shading, boldface, or subtle dividing lines to set them apart.

[Figure: Use subtle dividers]

  3. Rule 3. Right-align numbers and headers

Right-align numbers along the decimal place or comma. You might need to add zeros to maintain the alignment, but it’s worth it so the numbers are easier to read and scan. Always use fonts that have “lining numbers,” where all the numerals hit the baseline, and none drop below it.

[Figure: Right-align numbers]

  4. Rule 4. Left-align text and header

Once we’ve right-aligned the numbers, we should left-align the text.

[Figure: Left-align text]

  5. Rule 5. Select the appropriate level of precision

Precision to the fifth decimal place is almost never necessary. Strike a balance between necessary precision and a clean, spare table.

[Figure: Precision]

  6. Rule 6. Guide your reader with space between rows and columns

Your use of space in and around the table can influence the direction in which your reader reads the data. In the table on the left, for example, there is more space between the columns than between the rows, so your eye is drawn to read the table top-to-bottom rather than left-to-right. By comparison, the table on the right has more space between the rows than between the columns, so your eye is more likely to track horizontally rather than vertically. Use spacing strategically to match the order in which you want your reader to take in the table.

[Figure: Space]

  7. Rule 7. Remove unit repetition

Your reader knows that the values in your table are dollars because you told them in the title or subtitle. Repeating the symbol throughout the table is overkill and adds clutter. Use the title or column title area to define the units, or place them in the first row only (remembering to align the numbers along the decimal). If you are mixing units within the table, be sure to make your labels clear.

[Figure: Remove repetition]

  8. Rule 8. Highlight outliers

If we want to point out some observations, we might want to highlight outlier values by making the text boldface, shading it with color, or even shading the entire cell. Some readers will wade through all of the numbers in the table because they need specific information, but many readers are more likely to look for only the most important values.

[Figure: Highlight outliers]

  9. Rule 9. Group similar data and increase white space

Reduce repetition by grouping similar data or labels. Similar to eliminating dollar signs on every number value, we can reduce some of the clutter in our tables by grouping like terms or labels. In this next example, grouping the names of the country regions reduces the amount of repetitive information in the first column. You can also use spanner headers and rules to combine the same entry and reduce unnecessary repetition.

[Figure: Group similar data]

  10. Rule 10. Add visualizations when appropriate

Just like highlighting outliers with color or boldface, you might add sparklines to visualize some data rather than showing every number. Or you can use small bar charts to visually illustrate a series of numbers. Or you could use a heatmap and leave the numbers in the table or hide them, which can help the reader focus on the overall patterns and ignore the details. We can also embed a chart-type structure right into our table. If you want a full chart embedded within the table, a dot plot is succinct and can line up well within the linear structure of a table. You can also use a modification on the standard dot plot to place the numbers in their relative positions directly in a table.

[Figure: Highlight outliers]
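
Several of these rules can be tried out with base R alone. The sketch below is only an illustration on made-up data (`sales`, its values and the USD unit are assumptions, not part of the report): it rounds to one decimal place (Rule 5), states the unit once in the header (Rule 7), right-aligns the numbers (Rule 3), left-aligns the text (Rule 4) and offsets the header with a single light rule (Rules 1 and 2).

```r
# made-up example data (not from the report)
sales <- data.frame(
  region = c("North", "South", "West"),
  revenue = c(1234.5678, 98.7, 12045.31)
)

# Rules 5 and 7: one decimal place, unit stated once in the header
header <- c("Region", "Revenue (USD)")
revenue_txt <- formatC(sales$revenue, format = "f", digits = 1, big.mark = ",")

# Rule 4: left-align text and its header (flag "-" pads on the right)
region_col <- formatC(c(header[1], sales$region), flag = "-",
                      width = max(nchar(c(header[1], sales$region))))
# Rule 3: right-align numbers and their header (default padding is on the left)
revenue_col <- formatC(c(header[2], revenue_txt),
                       width = max(nchar(c(header[2], revenue_txt))))

rows <- paste(region_col, revenue_col, sep = "  ")
# Rules 1 and 2: one light rule separates the header from the body
table_lines <- c(rows[1], strrep("-", nchar(rows[1])), rows[-1])
writeLines(table_lines)
```

For reports, a table package would normally handle alignment and precision; the manual padding here only makes the individual decisions visible.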

7 Data Science for Psychologists (Hansjörg Neth)

7.1 Using colors in R

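# Seeblau is a color exported by unikn; the prefix makes the call work without attaching the package
unikn::seecol(c(unikn::Seeblau, "deepskyblue"))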

8 Linting

The code in this RMarkdown is linted with the lintr package, which is based on the tidyverse style guide.

# lintr::lint("main.Rmd", linters =
#               lintr::with_defaults(
#                 commented_code_linter = NULL,
#                 trailing_whitespace_linter = NULL
#                 )
#             )
# # if you have additional scripts and want them to be linted too, add them here
# lintr::lint("scripts/my_script.R")
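
To see what such a check reports, the same configuration can be applied to a one-line snippet instead of a file. This is a hedged sketch that assumes lintr 3.0.0 or later, where `linters_with_defaults()` replaces `with_defaults()` and `lint()` accepts a `text` argument; `snippet` and `my_linters` are illustrative names.

```r
# a tiny snippet that breaks the tidyverse rule of assigning with `<-`
snippet <- "x = 1\n"

# default (tidyverse style) linters, with the same two linters switched off
my_linters <- lintr::linters_with_defaults(
  commented_code_linter = NULL,
  trailing_whitespace_linter = NULL
)

# lint the snippet; the result lists each style violation with line and column
lints <- lintr::lint(text = snippet, linters = my_linters)
lints
```

With the package versions pinned for this report (CRAN as of 2021-06-01), the older `with_defaults()` call shown above is the one to use.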